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THE  HUMAN  COMPUTER  INTERFACE 


INTRODUCTORY  REMARKS 

Albert  N.  Badre 

When  asked  to  sit  down  at  a  computer  terminal  and  perform  what 
is  considered  an  elementary  task,  most  novice  operators  are  likely 
to  be  confused  £ind  frustrated.  Even  the  simplest  of  tasks  seems  to 
require  an  excessive  level  of  computer  sophistication  or  the 
motivation  to  read  and  understand  an  overabundance  of  accompanying 
documentation. 

The  population  of  computer  users  is  growing  at  a  very  rapid 
pace,  and  an  increasingly  large  number  of  this  generation  of  new 
users  is  not  data  processing  or  computer  trained.  Yet, 

-  the  language  that  the  operator  must  use  to  Interact  with 
the  machine 

-  the  documentation,  whether  on-line  or  off-line,  that 
he/she  has  to  read  in  order  to  learn  how  to  instruct  the 
machine;  and 

-  the  system  messages  that  are  displayed 

are  couched  in  the  vocabulary  and  language  habits  of  the  computer 
expert. 
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Accordingly  there  is  a  growing  conL.ansu8  in  the  computer  science 
community  that  the  user-compatibility  of  the  human  interface  should 
be  considered  and  Incorporated  into  the  design  of  all  computer  systems 
at  the  initial  stages  of  development.  "Information  processing" 
systems  are  likely  to  be  more  user  compatible  if  they  are  designed  to 
adapt  to  the  information  processing  capabilities  and  limitations  of 
the  user.  It  is  becoming,  therefore,  increasingly  necessary  to 
explore  and  identify  the  human  information  processing  factors, 
constraints,  and  variables  that  are  associated  with  making  the 
interface  more  user  compatible.  This  means  identifying  and 
considering  factors  relating  to  what  the  operator  "does"  at  the 
display  station  in  order  to  perform  a  desired  task  and  what  the 
system  does  in  return. 

In  this  workshop  symposium  we  will  be  dealing  with  six  inter¬ 
related  topics  that  revolve  around  the  user  Interface  theme.  These 
are:  Modeling  the  user.  Interface  development  factors,  design 
considerations  for  intelligent  and  adaptive  interfaces,  memory 
structures,  the  human  factors  of  language  interaction,  and  messages 
and  displays. 
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Experiences  with  a  Natural  Language 
Interface  to  an  ICAI  System 


Richard  Burton 
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Towards  a  Robust,  Task-Oriented 
Natural  Language  Interface 


Jaime  G.  Carbonell 
Carnegie-Mellon  University 
Pittsburgh,  PA  15213 


Abstract 

This  paper  analyzes  the  inception  of  a  new  generation  of  robust,  task-oriented  natural  language 
interfaces  in  light  of  new  theoretical  advances  and  analysis  to  avoid  limitations  of  previous  efforts. 
Three  key  ideas  are  discussed:  1)  dynamic  selection  of  parsing  strategies,  2)  exploiting  domain- 
specific  semantics  and  grammatical  constructions,  and  3)  integrating  recent  theoretical  findings  into 
task-oriented  parsing.  An  implemented  natural  language  interface  conforming  to  some  of  the  new 
objectives  is  discussed,  as  are  current  plans  for  a  more-general-scope  natural  language  interface. 


3  February  1981 
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Towards  a  Robust,  Task-Oriented 
Natural  Language  Interface 


Jaime  G.  Carbonell 
Carnegie-Mellon  University 
Pittsburgh,  PA  15213 


1.  Objectives  and  Historical  Perspective 

Natural  language  comprehension  has  been  studied  from  two  primary  perspectives  in  Artificial 
Intelligence: 

•  As  a  vehicle  to  investigate  and  simulate  human  cognitive  processes  embodying 
components  of  either  a  linguistic  or  psychological  theory  of  language  comprehension. 

•  As  a  means  of  implementing  task-oriented  "natural  language  front  ends"  to  complex 
computer  systems. 

The  "basic  science"  approach  has  produced  some  significant  principles  and  techniques  (e.g., 
expectation-based  language  analyzers  [7, 1]),  but  no  truly  robust  parsers  for  computer-naive  users 
have  been  developed  in  this  paradigm. 

The  applied  "engineering"  approach  has  proceeded  by  either  building  the  domain  of  application 
into  the  parser  itself,  or  by  relying  on  syntax-only  linguistic  parsers.  Neither  approach  has  proven 
wholy  satisfactory.  The  former  suffers  from  virtual  lack  of  transferability  to  new  domains,  while  the 
latter  suffers  from  extreme  fragility:  the  insd3ility  to  cope  with  any  input  not  strictly  conforming  with  its 
rigid  internal  grammar.  However,  it  must  be  noted  that  some  successful  parsers  have  emerged  from 
these  limited  approaches,  such  as  LIFER  [5]  and  LUNAR  [8].  Both  of  these  efforts,  unfortunately, 
required  man-years  of  development  and  tuning  before  their  performance  approached  the  user- 
acceptance  level.  Their  primary  contributions  were  in  the  computational  mechanisms  they 
introduced,  which  could  later  be  incorporated  into  more  sophisticated  parsers. 

A  major  objective  in  the  design  of  task-oriented  parsers  is  to  provide  the  user  maximal  flexibility 
(within  the  semantics  of  the  domain)  to  express  his  utterance.  For  example,  the  graceful  interaction 
project  [4]  is  a  recent  attempt  at  coping  with  limited  ungrammaticality  in  a  task-oriented  parser.  The 
means  by  which  recent  task-oriented  parsers  strive  for  robustness  and  flexibility  is  to  incorporate 
domain  semantics  into  their  parsing  knowledge  bases  (but  not  into  the  programs  themselves).  Here, 
we  go  one  step  further  and  exploit  domain  knowledge  to  dynamically  choose  the  optimal  parsing 
strategies.  Moreover,  the  work  described  in  this  paper  attempts  to  take  ful!  advantage  of  lessons 
learned  from  more  theoretical  natural  language  research.  Our  objectives  can  be  summarized  as 
follows: 
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•  Create  a  robust  parser,  in  the  sense  that  it  must  tolerate  common  ungrammaticality, 
ellipsed  constructions,  and  different  phrasings  within  its  domain  of  application. 

•  Implement  the  parser  in  a  modular  manner  with  respect  to  its  knowledge  sources.  This 
means  that  domain  knowledge  necessary  for  the  parser  ought  to  be  divorced  from  the 
program,  from  general  semantic  knowledge,  and  from  linguistic  knowledge.  Hence,  only 
one  knowledge  base  need  be  altered  in  transfering  the  parser  to  a  new  application 
domain.  The  program  itself  is  general  with  respect  to  the  choice  of  task  domain. 

•  Exploit  new  advances  in  natural  language  processing  not  previously  incorporated  into 
task-oriented  parsers.  Some  well-established  powerful  methods  developed  to  simulate 
human  language  understanding  (most  notably  expectation-based  disambiguation)  have 
not  previously  been  used  in  task-oriented  approaches,  although  they  have  proven 
computationally  effective  in  more  general  domains. 

•  Minimize  the  time  required  to  transfer  the  parser  to  a  new  domain.  This  goal  is  furthered 
by  our  modularity  consideration,  but  in  addition  I  want  to  work  towards  a  uniform  method 
of  incorporating  new  domain  knowledge,  including  knowledge  of  technical  jargon 
particular  to  a  given  domain. 


In  order  to  further  these  ends  I  developed  an  initial  parser  that  combines  partial  pattern  matching, 
semantic-grammars  [5]  and  equivalence  transformations.  I  applied  this  parser  to  the  task  of  building 
and  querying  a  semantic-network  [2]  data  base.  The  central  lesson  learned  from  this  exercise  is  that 
the  combination  of  the  three  parsing  strategies  yields  not  only  a  more  robust  parser  than  a  single¬ 
strategy  method,  but  surprisingly  the  time  it  took  develop  its  domain  application  (admittedly  not  a  very 
complex  task)  was  considerably  less  than  expected  (less  than  three  weeks). 


A  crucial  (and  perhaps  unintuitive)  fallacy  of  previous  task-oriented  parsers  is  their  commitment  to  a 
simple  uniform  parsing  strategy.  Since  natural  language  is  a  complex  phenomenon  (even  in  task- 
oriented  domains),  this  design  criterion  had  the  effect  of  pushing  the  complexities  into  the  domain 
'  grammars,  dictionaries  and  other  domain-specific  components  of  the  parser.  In  the  clearer  vision  of 

hindsight,  this  design  decision  greatly  complicated  the  application  of  existing  parsers  to  new 
domains.  Is  it  not  more  desirable  to  incorporate  all  the  decision-making  complexities  required  to  parse 
natural  language  structures  into  the  kernel  program  itself?  Once  built,  this  program  need  not  be 
redesigned  for  a  new  task  domain.  Minimizing  the  requisite  complexity  and  size  of  domain-dependent 
components  is  an  extremely  productive  venture.  Parsing-strategy  selection,  semantic  matching 
routines,  and  other  domain-independent  components  should  be  provided  as  a  kernel  parser,  which  is 

(  augmented  by  domain-specific  knowledge  bases  in  each  applications  domain. 

i 

1 

In  designing  the  kernel  parser,  a  dominant  criterion  is  that  it  select  the  parsing  strategy  in 
accordance  with  the  type  of  natural  language  construct  it  attempts  to  parse.  Some  information  can  be 
expressed  more  naturally  and  more  parsimoniously  in  one  form  (e.g.,  linear  patterns)  while  other 
information  is  best  expressed  as  case  structures,  equivalence  transformations,  or  semantic  grammar 
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productions.  To  illustrate  this  point,  I  attempted  to  encode  all  the  knowledge  in  my  parser  as  a  pure 
semantic  grammar.  This  task  has  more  than  tripled  the  size  of  the  task-specific  knowledge  base,  and  I 
have  not  yet  finished  (nor  do  I  intend  to  finish)  the  conversion.  The  primary  reason  for  the  relative 
increase  in  size  is  that  much  of  the  information  must  be  stated  with  a  high  degree  of  redundancy  and 
often  in  an  awkward,  round-about  manner  when  it  must  be  coerced  into  a  uniform,  context-free 
representation. 


2.  The  DYPAR  Parser 

DYPAR^  combines  three  parsing  strategies: 

•  A  context-free  semantic  grammar  component,  grouping  domain  information  into 
hierarchical  semantic  categories  useful  in  classifying  individual  words  and  phrases  in  the 
input  language. 

•  A  partial  pattern  match  component,  represented  as  pattern-action  rules.  The  patterns 
may  contain  individual  words,  semantic  categories  (from  the  semantic  grammar),  wild 
cards,  optional  constituents,  register  assignment  and  register  reference.  This  method 
enables  the  semantic  grammar  non-terminal  categories  to  be  applied  in  a  much  more 
effective  context-sensitive  manner  than  would  be  the  case  is  a  pure  context-free  grammar 
recognizer. 

•  Equivalence  transformations  map  domain-dependent  and  domain-independent 
constructs  into  canonical  form,  requiring  a  fraction  of  the  patterns  and  semantic 
categories  that  would  otherwise  be  necessitated.  If  a  phrase-structure  can  be  expressed 
in  several  different  ways,  while  retaining  the  same  meaning,  it  is  clearly  beneficial  to  first 
map  it  into  canonical  form,  rather  than  being  forced  to  include  all  possible  variants  in 
every  context  where  that  constituent  could  occur. 

Below  I  give  an  example  of  each  type  of  linguistic  information  used  in  DYPAR.  In  order  to 
understand  these  examples,  a  few  notational  conventions  must  be  introduced:  <BR  ACKETS>  denote 
a  non-terminal  semantic  grammar  symbol.  A  word  starting  with  an  exclamation  mark  (e.g., 
'.REGISTER)  denotes  the  name  of  register.  A  vertical  bar  (|)  denotes  disjunction  in  a  pattern.  A  #  in 
a  pattern  matches  a  single  word.  An  asterisk  (*)  matches  an  arbitrary  sequence  of  words.  The 
construction  (IREGISTEP  pattern)  assigns  whatever  matches  the  pattern  to  the  register  specified.  A 
colon  (:)  before  a  constituent  in  a  pattern  indicates  that  constituent  is  optional. 

DYPAR,  as  we  see  in  the  dialog  below,  is  the  front  end  of  a  semantic  network  data-base  update  and 
query  system.  Therefore,  its  domain  knowledge  consists  of  language  constructs  relevant  to  this  task. 
First,  consider  a  fragment  of  its  semantic  grammar: 
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Robust  multi -strategy  "OYnamic  PARsing”  is  still  in  its  infant  stages,  requiring  frequent  changes. 
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<INFO*REQ>  ->  (<VIHAT-Q>  |  <INF0-REQ1>] 

<INF0-REQ1>  ->  (:  <POLITE>  <INF0-REQ2>  :  <WHAT-Q>] 

<INF0-REQ2>  ->  (TELL  <me"US>  :  ABOUT  |  GIVE  <me-US>  |  PRINT  |  TYPE  ] 

This  fragment,  together  with  the  rewrite  rules  for  the  other  non-terminals  above  (e.g.,  <BE-PRES>, 
whose  rewrite  is  all  the  present-tense  conjugations  of  the  verb  "to  be")  recognizes  the  initial  segment 
of  information-request  queries  such  as;  "What  is "Tell  me  what  is "Tell  me  about...",  "Would 
you  give  me ...",  etc. 

Now,  consider  a  pattern-match  rule: 

(:  <dot>  (Ival  #)  <be-pre8>  :  <DET>  (I PROP  0)  OF 
:  <DET>  (INAM  0)  :  <dpunct>) 

■> 

(LTM-STORE  INAM  IVAL  IPROP) 

This  rule  recognizes  sentences  such  as;  "Felix  is  a  friend  of  Fido",  or  "Reagan  is  president  of  the 
USA",  and  passes  the  information  to  the  data  base  manager  for  consistency  checking  and  storage.  In 
order  to  pass  the  information  gathered  in  the  pattern  match  process,  the  registers  are  assigned 
appropriate  values.  For  instance,  in  the  second  example,  INAM  is  assigned  "USA",  IPROP  is  assigned 
"president"  and  IVAL  is  assigned  "Reagan". 

The  equivalence  transformations  also  use  the  pattern  matcher.  For  instance,  consider  the  following 
simple  (but  useful)  transformation; 

((ISl  •)  (IVIl  0)  <P0SS>  (IW2  0)  (IS2  •)  :  (IP  <PUNCT>)) 

(NCONC  ISl  IW2  (LIST  ’OF)  IWl  IS2  IP) 

This  transformation  maps  possessive  constructions  into  attribute-value  constructions,  which  we 
chose  as  canonical.  For  instance  "Tell  me  about  the  VAX-7Lo's  performance."  is  mapped  into  "Tell 
me  about  the  performance  of  the  VAX-785."  fhe  latter  construction  is  recognized  by  a  pattern-action 
rule.  Since  possessive  constructions  can  occur  in  many  contexts,  the  single  transformation  above 
saves  us  from  duplicating  pattern  match  rules  for  each  different  context  where  an  attribute-value 
construction  can  occur. 

The  control  structure  of  DYPAR  Is  essentially  an  applicative  condition-action  cycle,  which  halts 
upon  no  rules  being  applicable  (in  which  case  a  partial  pattern-match  strategy  is  attempted),  or  upon 
one  of  the  pattern-action  rules  firing  and  passing  its  information  to  the  data-base  manager.  Let  us  see 
a  brief  dialog  with  DYPAR.  For  simplicity,  we  start  out  with  an  empty  data  base.  Items  in  italics  below 
were  typed  in  by  the  user.  Everything  else  was  output  by  the  system.  Items  in  UPPER  CASE  are 
paraphrases  of  internal  semantic  network  relations  output  by  a  rudimentary  English  generator. 

*  (parser) 

You  can  build  and  query  a  simple  semantic  network 
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by  stating  information  and  questions  in  Engli'^li. 

+Fido  is  a  nice  dog. 

Storing  assertion  in  semantic  net:  FIDO  IS  DOG. 

[COMKENT:  When  DYPAR  does  not  recognise  coiicepls  in  the  input.  U  prompts 
the  user  for  additional  information  required  by  the  integrity-checker 
in  the  data-base  manager.] 


What  is  NICE? 

+A//ce  is  a  disposition. 

Storing  assertion  in  semantic  net:  NICE  IS  DISPOSITION. 

Is  it  correct  to  say  that  THE  DISPOSITION  OF  FIDO  IS  NICE? 
(Y.N):  yes. 

Adding  new  assertion:  THE  DISPOSITION  OF  FIDO  IS  NICE. 

+Fido's  cousin  is  Felix. 


Transforming  (FIDO  $S  COUSIN  IS  FELIX  PERIOD) 

Into  ::>  (THt  COUSIN  OF  FIDO  IS  FfLiX  PERIOD) 


Transforming  (THE  COUSIN  OF  FIDo  IS  tEilX  PERiOD) 
Into  ::>  (FELIX  IS  THE  COUSIN  01  FIDO  PERIOD) 


Adding  new  assertion;  1HF  COUSIN  OF  FIDO  IS  FELIX. 


[COMMtNT:  To  illi'Slrate  the  parsing  process  in  a  bn  more  detail, 
the  rule  tracer  was  activated  for  liie  (ollo».:iig  nuaract  ion.  ] 


+Cousin  is  a  synimetnc  relation. 

Parse  is; 

Rule  RUL19 

Action  will  be:  (PROGN  (LTM-STORF  !NAM  !VAL  (QUOTE  ISA:)) 

(LTM-SPEC  !NAM  iVORP  NIL  NIL  T)) 

With  bindings: 

(IVAL  RELATION) 

(IVORP  SYMMETRIC) 

(INAM  COUSIN) 

I  already  knew  that:  COUSIN  WAS  A  RELATION. 

Is  it  correct  to  say  that  THE  RFLATION  OF  COUSIN  IS  SYMMETRIC? 
(Y.N):  yes. 

Adding  new  assertion;  THE  RELATION  OF  COUSIN  IS  SYMMETRIC. 
Inference:  Adding  new  assertion:  THE  INVERSE  OF  COUSIN  IS  COUSIN. 

Inference:  Adding  new  assertion:  THE  COUSIN  OF  FELIX  fS  FIDO. 


*Ciit  cbar.ing  k,  a  dog’s  hobby. 


Transforming  (CAl -CHASING  IS  A  DOC.  5S  HOBBY  PERIOD) 
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Into  ::> 


(CAT-CHASING  IS  A  HOBBY  OF  DOG  PFRIOD) 


Adding  new  assertion:  THI  HOBBY  OF  DOG  IS  CAT -CHASING. 

ICOMHtNJ-  Since  the  program  never  heard  of  "dog"  or  “f  ido" 
it  does  not  know  whether  to  express  them  with  or  withoot  a  deteiminer. 
"Fido"  is  a  proper  name,  therefore  it  needs  no  determiner,  but  “dog* 
in  all  the  examples  above  looks  strange  without  one-  Now.  we  fix  that 
problem  by  providing  the  program  with  the  necessary  information  (in 
English,  of  course).] 


+The  concept  of  dog  is  generic. 

Transforming  (THE  CONCEPT  OF  DOG  IS  GhNERJC  PERIOD) 

Into  ::>  (GENERIC  IS  THE  NODF.-TYPt  OF  DOG) 

(COMMENT:  In  our  domain  "concept”  and  "node-type"  are  synonyms  In 
future  output,  "dog"  will  be  printed  with  a  detc'ii.oer  *rien  appropriate  ] 


Adding  new  assertion:  THE  NODf-TYPE  OF  DOG  IS  GENERIC. 

+Cou/d  you  tell  me  what  a  dog's  hobby  is'^ 


Transforming  (COULD  YOU  TELL  ML  WHAT  A  DOC  SS  HOBBY  IS  QMARK) 
Into  ::>  (COULD  YOU  TELL  ME  WHAT  IS  A  DCf,  $S  HOBBY  QMARK) 


Transforming  (COULD  YOU  TELL  ME  WHAT  IS  A  DOG  SS  HOBBY  QMARK) 

Into  ::>  (COULD  YOU  TELL  ME  WHAT  IS  THE  HOBBY  OF  A  DOG  QMARK) 


THE  HOBBY  OF  A  DOG  IS  CAT-CHASING. 


+give  me  everything  you  know  about  fido 
FIDO  IS  A  DOG. 

THE  HOBBY  OF  FIDO  IS  CAT-CHASING. 
THE  COUSIN  OF  FIDO  IS  FFLIX. 

THE  DISPOSITION  OF  FIDO  IS  NICE. 


+Napping  is  tne  hobby  of  Fido. 


That  contradicts  what  I  could  infer  Oy  inter  it<.'ice. 

THE  HOBBY  OF  FIDO  WAS  CAI-CHASING. 

Should  I  add  the  assertion  anyway?  (Y.K):  no. 

OK.  discarding  new  assertion. 

+F.xit  this  program. 

Leaving  natural  language  interface.  Back  to  LISP. 
(CPU-SECONDS:  12.056  GC-TIME:  6.780) 


As  vje  see  in  the  above  example,  robust  corn.munication  with  the  user  requires  not  only  a  flexible 
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domain-oriented  parser,  but  also  an  interactive  query  capability  and  a  natural  language  gener<<{or. 
However,  the  latter  two  processes  are  conceptually  simpler,  and  not  the  topic  of  this  paper. 

3.  Future  Directions 

DYPAR  illustrates  the  harmonious  integration  of  three  parsing  strategies.  However,  it  is  only  the  first 
step  in  exploiting  the  multi-strategy  approach  to  develop  real-world,  robust,  natural  language 
interfaces.  In  terms  of  sophistication,  DYPAR  straddles  the  boundary  between  an  advanced  toy  and  a 
rudimentary  real-applications  system.  One  direction  of  continued  development  is  to  enhance  the 
pattern  matcher,  build  additional  general  transformations,  and  create  a  sub-interface  to  facilitate 
extensions  to  the  grammar  by  a  domain  expert  (not  necessarily  a  natural-language  expert).  A  first  step 
in  the  direction  of  automating  and  simplifying  user  extensibility  has  been  taken  in  the  development  of 
the  KLAUS  system  [6].  At  CMU,  we  are  focusing  on  a  complementary,  and  perhaps  more  fundamental 
research  direction. 

if  the  gestalt  performance  of  integrating  three  parsing  strategies  has  proven  more  effective  than  the 
application  of  any  single  strategy,  why  not  extrapolate  this  result  to  include  additional  parsing 
strategies?  Indeed,  we  have  designed  a  flexible  control  structure  for  integrating  case-instantiation  as 
the  central  parsing  strategy  --  calling  upon  other  strategies  discussed  in  this  paper,  in  addition  to 
more  domain-specific  strategies,  when  appropriate  [3].  Case-frame  instantiation  is  the  most  general 
parsing  strategy  capable  of  exploiting  domain  semantics.  Hence,  it  should  provide  a  quantum  jump  in 
the  general  applicability  of  our  task-oriented  parser.  Moreover,  techniques  such  as  expectation-driven 
disambiguation  [7, 1]  developed  by  the  non-applied  school  of  natural  language  processing,  can  now 
be  brought  to  bear  in  real-world  applications.  The  reason  why  case-frame  parsers  have  not  been 
developed  in  task-oriented  domains  is  that  while  they  capture  general  principles  admirably,  they  fail  to 
recognize  specific  idioms,  compound  nouns  and  the  like.  However,  the  addition  of  partial  pattern 
matching  (idealiy  suited  to  detect  idiomatic  expressions)  integrated  with  case-frame  instantiation  and 
other  parsing  methods  should  provide  a  high  degree  of  generality  without  sacrificing  robustness. 

Graceful  interaction  with  the  user  is  a  worthy  goal  for  any  natural  language  front  end  whose  users 
may  be  computer-naive.  People  invariably  produce  ungrammatical  utterances,  leave  out  words,  add 
interjections,  and  use  terms  outside  the  vocabulary  of  any  system  [4].  It  is  essential  that  a  real-world 
system  "fail  soft"  in  such  circumstances,  and  interact  with  the  user  to  enable  graceful  recovery.  We 
saw  some  simple  examples  of  this  in  DYPAR.  However,  the  expectation-setting  provided  by  a  case 
system  incorporating  domain  knowledge  can  be  a  more  powerful  tool  to  minimize  failure. 

Consider,  for  instance,  a  file-management  system  where  a  user  may  type  "Transfer  the  flies  in  my 
directory  to  the  accounts  directory."  It  is  fairly  clear  to  us  humans  that  the  user  meant  to  type  "files", 
even  if  we  know  perfectly  well  that  "flies"  is  a  legitimate  word  in  our  vocabulary.  A  case-frame  system 
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knows  that  the  objective  case  in  the  transfer  imperative  (as  applied  to  the  file-management  domain) 
requires  a  logical  data  entity,  which  "flies"  is  not.  Realizing  this  violated  semantic  requirement,  It  can 
proceed  to  see  whether  by  spelling  correction,  morphological  decomposition,  or  detecting  potential 
omissions  it  can  map  "flies"  into  a  known  filler  of  that  case.  Here,  spelling  correction  works,  and  the 
system  can  proceed  to  inform  the  user  of  its  correction  (allowing  the  user  to  override  if  need  be). 

I  conclude  by  reiterating  my  central  theme:  Integration  o1  multiple  parsing  strategies  is  perhaps  the 
single  most  powerful  principle  in  the  development  of  robust,  task-oriented  natural  language 
interfaces. 
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CREATING  AN  ALGORITHM  FOR 
GENERATING  ABBREVIATIONS  TO  BE  USED 
IN  USER-COMPUTER  TRANSACTIONS 

Sam  Ehrenreich 

US  Army  Research  Institute  for  the 
Behavioral  and  Social  Sciences 

The  US  Army  is  in  the  process  of  developing  automated  tactical  systems. 
These  systems  will  incorporate  a  dialogue  mode  (e.g.,  form-filling,  menu,  query 
language)  for  communicating  between  the  user  and  the  computer.  For  the  con¬ 
venience  of  both,  much  of  this  communica*‘ion  will  involve  abbreviations.  The 
Army  Research  Institute  (ARI)  is  engaged  in  preparing  an  algorithm  for  use  by 
system  designers  in  creating  easy  to  use  abbreviations  for  these  systems.  The 
algorithm  will  not  only  be  concerned  with  generating  abbreviations  for  command 
terms.  Rather,  the  primary  domain  of  the  algorithm  will  be  the  lexical  terms 
used  in  exchanging  information  between  the  user  and  the  computer. 

This  summary  describes  the  empirical  issues  that  were  investigated  in  ARl’s 
abbreviation  project.  The  data  that  was  collected,  along  with  an  algorithm  for 
generating  abbrevi'tions,  will  be  presented  at  the  workshop. 

All  of  the  experiments  for  this  project  have  already  been  completed. 
However,  a  few  still  remain  to  be  analyzed.  The  participants  used  in  these 
experiments  were  enlisted  Army  personnel.  The  stimuli  used  were  words  which  are 
likely  candidates  for  abbreviation  on  an  automated  tactical  system.  However,  it 
is  believed  that  the  nature  of  both  the  participants  and  the  stimuli  are  such 
that  the  resulting  algorithm  will  be  applicable  for  use  with  most  classes  of 
operators  and  with  most  sets  of  words. 

The  general  abbreviation  techniques  which  were  considered  as  candidates  for 
forming  the  basis  of  the  algorithm  are:  (1)  truncation,  i.e.,  delete  all  but  the 
first  few  letters  of  a  word;  (2)  contraction,  i.e.,  remove  all  of  the  word's 
vowels  except  for  vowels  occurring  as  the  first  letter;  and  (3)  abbreviation 
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by  the  consensus  of  a  committee.  In  order  to  create  the  desired  algorithm,  the 
empirical  questions  which  were  investigated  are: 

1.  What  are  people's  personal  preferences  with  regard  to  the  abbreviations 
formed  by  the  different  abbreviation  techniques? 

2.  How  do  the  different  abbreviation  techniques  compare  when  participants  are 
presented  with  a  word  and  asked  to  recall  its  abbreviation  (i.e.,  encoding)? 

How  do  the  methods  compare  when  the  task  is  decoding? 

3.  When  participants  are  asked  to  produce  abbreviations  of  their  own  choosing, 
what  abbreviation  method  do  they  tend  to  naturally  use? 

4.  When  participants'  experiences  with  a  word  and  its  abbreviation  increases, 
do  the  absolute  and  relative  effectiveness  of  the  different  abbreviation  tech¬ 
niques  change? 

5.  VHien  participants  are  instructed  in  the  rule  system  underlying  the  different 
abbreviation  techniques,  do  the  absolute  and  relative  effectiveness  of  the 
abbreviations  change? 

6.  Should  abbreviations  be  of  a  fixed  or  variable  length? 

7.  How  can  different  words  that  result  in  identical  abbreviations  be  handled 
(e.g.,  when  using  the  truncation  method,  both  TRANSLATOR  and  TRANSPORT  are 
abbreviated  as  TRAN)? 

8.  Can  endings  (e.g.,  -ed,  -ing)  be  effectively  incorporated  into  abbreviations? 

The  answers  to  these  questions  will  represent  the  empirical  basis  on  which 
an  abbreviation  algorithm  is  formed.  Tlie  desired  algorithm  is  one  which  is 
completely  deterministic  in  the  abbreviations  it  forms.  Using  the  algorithm, 
the  system  designer  should  have  minimum  input  in  determining  the  abbreviation  to 
be  created.  Although  the  algorithm  that  will  be  created  will  not  be  based  on  a 
complete  Investigation  of  all  possible  variables,  it  is  expected  that  it  will 
result  in  abbreviations  which  are  significantly  easier  to  use  than  the  arbitrary 
and  inconsistent  abbreviations  presently  used  on  Army  systems. 
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1ool«  for  the  D«signcr«  of  Usor  Intorfacot 


Our  research  objective  i«  to  develop  methodologies  and 
tools  uhich  can  aid  in  the  design  of  user-computer 
interfaces.  Ue  want  to  impose  structure  on  the  typically 
very  complex  task  of  designing  a  user-computer  interfacei  so 
the  design  can  be  divided  into  manageable  pieces>  each  of 
which  can  be  dealt  with  in  a  systematici  rigorous  and  at 
least  partially  quantitative  way.  We  believe  this  will  nelp 
make  User  Interface  Design  more  of  a  science  and  less  of  an 
arti  and  lead  to  improved  design. 

7he  actual  process  of  designing  a  user  interface  can  be 
accomplished  as  four  major  steps>  which  we  call  the 
conceptual!  semantic>  syntactic*  and  lexical  design  steps. 
Each  step  can  be  dealt  with  in  sequence*  one  after  the 
other*  with  an  occasional  reexamination  of  a  previous  step. 
We  call  these  four  steps  a  design  framework. 

The  Design  T-ramework 

The  coriceptual  design  is  the  definition  of  the  key 
application  concepts  which  the  user  of  the  interface  must 
understand  in  order  to  use  the  system.  For  a  simple  text 
editor*  the  key  concepts  are  fil^s*  lines  of  a  file,  and 
operations  (add*  delete*  move)  on  lines.  The  conceptual 
model*  as  in  this  case*  typically  defines  objects*  relations 
between  objects  (a  litie  is  in  a  file)*  and  operations  on  the 
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objscts*  and  s»t«  tha  stag*  for  the  semantic  design  of  the 
user-computer  interface. 

The  semantic  design  deals  uih  the  functionality  of  the 
system  to  be  accessed  via  the  intermediary  of  the  user 
interface.  The  user  performs  certain  actions# 
calculations/processing  ensues#  and  information  is  presented 
to  the  user.  At  the  semantic  design  level  ue  are  concerned 
only  with  the  meanings  of  the  inputs#  the  processing#  and 
the  outputs:  we  are  not  concerned  with  the  form  or  the 
sequence  of  the  inputs  and  outputs. 

The  syntactic  design  deals  with  the  sequence  of  the 
inputs  and  outputs.  f-or  the  input#  sequence  is  akin  to 
grammar — the  rules  by  which  sequences  of  words  in  a  language 
are  formed  into  legitimate  sentences.  The  types  of  words  in 
an  input  sentence  are  typically  commands#  quantities#  names# 
coordinates#  or  arbitrary  text.  As  in  English#  the  words 
are  the  units  of  meaning  in  the  input  and  cannot  be  further 
decomposed  without  losing  their  meaning.  to  include  the 
spatial  domain  as  well.  Therefore  the  output  syntax 
includes  the  2D  or  3D  organization  of  a  display  as  well  as 
any  temporal  variation  in  the  form.  The  "words'*  in  the 
output  sequence#  by  analogy  to  the  input  sequence#  represent 
the  units  of  meaning  being  conveyed  from  the  computer  to  the 
user.  The  units  of  meaning  are  often  conveyed  graphically  as 
symbols  and  drawings  made  up  of  lines#  curves#  and  points 
rather  than  as  words  made  up  of  letters. 
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The  lexical  design  determines  hou  words  in  the  input 
and  output  are  actually  formed  from  the  available  hardware 
capabil ities.  For  input#  this  involves  designing  the 
interaction  techniq.ues  for  the  application.  An  interaction 
technique  is  a  way  of  using  a  physical  input  device  (tablet, 
keyboard,  mouse,  etc.  >  to  input  a  certain  type  of  word 
(command.  value.  coordinates.  etc.  ).  For  example,  some  of 
the  interaction  techniques  for  command  specification  are 
selection  From  a  menu  with  a  liht  pen  or  with  a  cursor 
controlled  by  a  mouse,  typing  of  the  command  name  on  a 
keyboard,  and  speaking  the  name  of  the  command  into  a  speech 
recognizer. 

For  output.  lexical  design  means  forming  the  symbols 
and  shapes  which  are  to  be  presented  to  the  user,  using  the 
available  hardware  lexemes.  For  text  output,  this  reduces 
to  selecting  text  attributes  such  as  font.  size.  color, 
background  color:  the  spelling  (i.e. .  combination  of 
hardware  lexemes,  the  character  set)  of  words  is  already 
defined  in  the  dictionary.  In  other  cases.  such  as 
situation  displays,  the  symbols  used  must  be  designed  and 
composed  from  lexemes  such  as  lines  and  other  grahics 
primitives,  and  the  symbols  must  be  assigned  attributes  such 
as  color,  intensity,  linestyle.  and  size. 

The  nub  of  this  Four“level  framework  for  design  are 
found  in  formal  language  theory;  the  framework  has  been 
successively  refined  and  reported  in  a  series  of  papers 
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CF0LE74,  FOLETS.  FQLE80.  FOLESlb]. 


We  have  uiorked/are 


working  with  this  framework  in  several  ways:  the 
organizatin  of  design  principles*  the  evaluation  of  existing 
user-cof«,<oter  interfaces*  the  evaluation  of  interaction 
tachniq,ues  (which  are  the  lexical*-level  design  of  the 
input)*  the  formal  specification  of  the  syntactic  and 
lexical  design  of  input  and  output*  the  calculation  of 
metrics  of  “goodness"  based  on  the  formal  specification*  and 
the  design  of  an  "abstract  interaction  handler"  to  remove 
much  of  the  syntactic  and  lexical  design  from  the 
application  program. 

Organizing  Design  Principles 

The  past  ten  years  have  seen  several  user  interface 
designers  setting  forth  their  design  principles  CBEr4N76* 
BRITT77*  ENOeyS*  HANS71*  WALL76a  in  the  form  of  general 
objectives  and  specific  do^s  and  dont's.  These  papers  plus 
personal  experience  form  the  knowledge  base  available  to 
most  designers.  Often  the  criteria  are  soundly-based:  a 
useful  start  in  developing  tool&  for  designers  is  to 
organize  the  principles*  showing  how  they  apply  at  the 
conceptual*  semantic*  syntactic*  and  lexical  design  levels. 
This  process  has  been  partially  completed*  as  reported  in 
FOLESlb*  for  principles  dealing  with  feedback*  error 
correction.  response  time*  consistency.  and  display 
structure. 

Evaluating  Dser— Computer  Interfaces 
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Given  sn  organised  set  of  design  criteriai  it  is 
possible  to  perform  a  systematic  evaluation  of  existing 
user-computer  interfaces  by  a  combination  of  watching  others 
use  the  interface  and  learning  to  use  the  interface 
oneself.  In  this  process  it  is  critical  to  note 
idiosyncratic  features  of  an  interface  when  they  are  first 
encountered*  lest  one  adjust  to  the  features.  Two  such 
evaluations  have  thus  far  been  conducted:  the  first 
CHERB803  of  DIDS*  the  Decision  Informatin  Display  System 
used  by  the  federal  government  for  policy  studies;  the 
second  CBLES811  of  SEEDIS*  the  Socio-Economic  Environmental 
Demographic  Information  System  developed  at  Lawrence 
Berkeley  Labs.  A  third  evaluation  will  be  of  a  new 
user-interface  design*  prior  to  its  implementation*  for 
Battelle  Northwest  Labs'  ALDS  (Analysis  of  Large  Data  Sets) 
system. 

Evaluation  of  Interaction  Techniques 

Recall  that  an  interaction  technique  is  a  way  of  using 
a  physical  input  device  to  input  a  word*  and  hence  is  the 
lexical  level  input  design.  In  EOLESla  we  have  described 
and  organized  the  interaction  techniques  by  their  purpose* 
which  can  be  to  make  a  selection*  designate  a  position* 
orientation*  or  sequence  of  positions  and  orientations* 
input  a  value*  or  input  a  character  string.  A  number  of 
germane  human  factors  design  issues  have  been  identified  for 
the  techniques  by  drawing  on  the  literature  and  the 
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guid«lines  mentioned  above.  Nine  experiments  dealing  with 
interaction  technjq,ues  are  also  critically  reviewed.  A 
method  of  interaction  technique  diagrams  is  created#  to  aid 
in  understanding#  analyzing#  and  dotumenting  the  techniques 
and  experiments.  A  diagram  showi^  the  cognitive#  motor#  and 
perceptual  steps  which  the  user  of  a  technique  performs. 
The  report  is  meant  as  a  guide  to  aid  designers  in  selecting 
appropriate  interaction  techniques  and  devices. 
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Formal  Spec  if icat j on  and  Metric* 


Ihe  syntactic  and  lexical  de^^igns  of  a  user  interface 
should  be  describable  by  formal  language  tools*  in  the 
spirit  <but  not  necessarily  in  the  image)  of  BNF*  regular 
expressions*  and  flou  expressions.  We  are  developing  formal 
tools  for  describing  both  the  input  and  output  of  a  user 
interface*  as  uiell  as  the  relationship  betueen  input  and 
output.  1he  input  definition  deals  with  concepts  such  as 
token  types  (which  arc  the  purposes  of  interaction 
techniques*  as  described  above)*  sequences  of  tokens*  and 
the  binding  of  tokens  to  sequences  of  actions  wth  physical 
devices.  The  output  definition  deals  with  concepts  such  as 
screen  areas  and  their  contents*  and  attributes  (such  as 
color*  font*  and  lincstyle)  of  tokens  within  various  areas. 
Metrics  treat  issues  such  as  complexity  and  consistency  of 
syntactic  rules*  cortsistency  in  the  use  of  codings* 
continuity  of  visual  attention  on  the  display*  continuity  of 
tactile  motion  with  the  interaction  devices*  and  time 
required  to  input  commands.  The  metrics  draw  upon  the 
guidelines  mentioned  above. 

The  designer  of  a  user  interface  will  use  the  tools  to 
describe  the  interface.  This  in  itself  helps  create  a  more 
disciplined  design  environment.  In  addition*  the  formal 
definition  will  be  processed*  metrics  evaluated*  and 
potential  design  problems  flagged  for  further  attention  by 
the  designer.  In  the  long  run*  the  user  interface  definition 
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will  b«  input  to  an  interaction  handler  which  will  actually 
implement  the  user  interface. 
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Abstract  Interaction  Handler 


Writing  an  interactive  application  program  involves 
coding  the  semantict  sgntactici  and  lexical  designsi 
typically  using  FQRIRAN1  PASCAL <  or  a  similar  language. 
There  are  tuo  problems  with  this.  First/  the  procedural 
languages  are  not  we-ll-'suited  to  programming  the  syntactic 
and  lexical  designs.  Secondly/  it  is  easy  to  intertwine  the 
code  which  implements  each  of  the  three  levels/  making  later 
changes  to  any  of  the  levels  difficult.  The  abstract 
interaction  handler  is  being  designed  to  implement  the 
syntactic  and  lexical  aspects  of  input/  and  those  parts  of 
the  syntactic  and  lexical  output  design  having  to  do  with 
interaction/  such  as  menus/  prompts/  and  error  messages. 

This  approach  allows  much  of  the  user  interface  to  be 
changed  by  modifying  the  interface  definition  made  available 
to  the  interaction  handler  rather  than  by  reprogramming.  It 
will  be  possible  to  use  two  completely  different  user 
interfaces/  such  as  menu  driven  and  command-language  driven/ 
with  the  same  application  program/  and  to  "fine-tune"  the 
details  of  a  given  user  interface.  Within  the  interaction 
handler/  syntactic  and  lexical  level  designs  will  be 
separated/  so  that  one  can  be  easily  changed  without 
affecting  the  other.  A  preliminary  design  of  an  interaction 
handler  can  be  found  in  FELD81. 
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Psycholo£ical  structure  in  information  organization  and  retrieval; 
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and  work  in  progress. 

George  W.  Furnas 

Computer-user  Psychology  Research  Group 
Bell  Laboratories,  Murray  Hill,  NJ 


Any  given  artificial  storage  and  retrieval  system  forces  structure 
on  the  information  stored  within  it.  Psychologically,  however 
many  kinds  of  structures  exist  for  the  representation  of 
information,  and  each  has  domains  where  it  is  well  suited  and 
domains  where  it  is  at  best  misfit.  The  motivating  assumption  here 
is  that,  if  one  wishes  to  make  information  systems  humanly 
accessible,  more  serious  consideration  is  needed  of  the  variety  of 
representations  characterizing  human  knowledge,  coupled  with  the 
necessary  invention  of  new  compatible  retrieval  interfaces. 

A  textile  dyer  would  no  doubt  be  exasperated  by  a  menu-driven,  or 
even  key  word,  specification  of  colors.  Our  knowledge  of  color 
space  argues  that  adjusting  thre'e  knobs,  or  perhaps  moving  a  light 
pen  on  a  graphics  screen  would  probably  be  m.uch  better.  In 
contrast,  asi:ing  zoo  visitors  to  access  information  about 
individual  animals  by  this  same  three-knob  technology  would  be 
ridiculous.  Menus  or  keywords  would  be  very  appropriate.  The 
domain  of  animals  has  a  very  different  structure  than  does  that  of 
color,  and  to  use  the  sam.e  retrieval  system,  for  the  two  is  a 
mistake. 

Not  much  experim.ental  evidence  exists  regarding  implications  for 
computer  access,  but  from  the  standpoint  of  reflecting 
psychological  similarity,  recent  work  by  Pruzansky.  Tversky  and 
Carroll  (1980)  emphasizes  the  diversity  of  appropriate 
representations.  Using  currently  available  scaling  procedures  in 
a  large  survey  of  categories,  they  typically  found  the  domains  to 
differ  strongly  in  the  relative  suitability  of  tree  and 
multidimensional  structures  for  capturing  people's  similarity 
judgements . 

There  are  of  course  even  more  representational  structures  than  the 
two  investigated  by  Pruzansky.  et  al.  From  the  context  of 
similarity  scaling  alone,  one  might  mention,  in  addition  to 
multidimensional  spaces  and  hierarchical  clusterings,  additive 
trees,  more  general  graphs,  factor-analytic  structures,  additive 
clusterings,  etc.  These  structures  differ  in  many  ways,  including 
continuity,  contingency  constraints  on  structural  components, 
complexity,  and  symmetry.  All  of  these  properties  presumably 
affect  representational  adequacy. 
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Scaling  techniques,  among  others,  can  help  to  identify 
psychological  adequacy  of  representations,  but  in  constructing 
retrieval  systems,  a  further  issue  arises:  How  can  any  of  the 
variety  of  possibly  appropriate  representational  structures  be 
accessed?  Hierarchical  tree  structures  lend  themselves  to 
classical  menu-tree  schemes,  and  multidimensional  configurations 
with  suitable  properties  (e.g.  low  number  of  dimensions, 
separability?)  may  perhaps  be  accessed  by  various  analog  input 
devices.  But  what  of  other  types  of  structures,  especially  as  we 
seek  richer  structural  representations? 

Thus  cognitive  considerations  motivate  the  search  for  nonstandard 
database  interface  solutions...  new  structures,  and  new  access 
processes.  The  work  presented  here  represents  a  simple  ongoing 
effort  in  that  direction.  It  basically  involves  a  generalization 
of  tree  structures,  and  of  the  corresponding  familiar  menu  access 
mechanisms . 

Standard  menu  systems  present  a  screenful  of  choices  subdividing 
the  domain  of  a  database.  The  user  makes  a  selection  from 
these,  resulting  in  a  new  set  of  more  detailed  selections,  further 
subdividing  the  selected  set.  A*  sequence  of  choices  from  a 
succession  of  menus  eventually  brings  the  user  t-j  some  final 
target  item.  Typically,  the  menus  are  organized  into  trees.  That 
is.  there  is  usually  only  one  sequence  of  choices  that  will  arrive 
at  any  given  target.  While  some  systems  have  exceptions  to  the 
unique  path  rule,  these  tend  to  be  infrequent,  and  certainly  not 
essential  to  the  character  of  the  system. 

Note  that  in  menu  trees,  there  are  m.any  choices,  a  whole  menu 
full,  presented  at  each  step  when  moving  down  .through  the 
structure.  There  are  occasions,  however,  when  one  must  move  back 
upward  in  generality,  as  in  recovering  from  a  mistake  or  changing 
targets  in  mid-searcn.  Then,  unlike  when  moving  downward,  there 
is  no  choice  given:  Trees  have  many  "down'  choices  at  any  point, 
but  only  one  up".  The  concept  being  explored  here  revolves 
around  allowing  menus  for  upward  choices,  as  well  as  the  usual 
downward  ones. 

The  psychological  motivation  goes  as  follows:  Consider  a  given 
node,  or  point  of  menu  presentation  in  the  structure,  to  represent 
a  conceptually  defined  class  of  possible  targets.  A  given 
conceptual  class  can  certainly  contain  many  different  subordinate 
classes,  enumerated  in  the  downward  menu,  but  often  in  rich 
domains  the  class  can  also  be  contained  in  many  superordinate 
classes.  A  traditional  tree  representation  is  forced  to  organize 
on  the  basis  of  only  one  superordinate  at  each  level.  In  so  far 
as  these  different  superordinates  may  each  be  useful  in  different 
circumstances,  this  psychological  organization  should  be  reflected 
in  the  access  structure,  by  giving  users  choice  when  moving  to 
superordinate  levels. 
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Imafine.  for  example,  one  had  a  computerized  system  for  retrievinf 
cookinf  recipes  that  was  "being  used  to  plan  a  meal.  Imagine 
further  that  the  user  had  proceeded  down  to  a  screenful  of  choices 
about  types  of  salad  (CAESAK,  SPINACH  &  MUSHHUUM .  etc.),  but  had 
just  decided  after  all.  againsx  any  salad  for  the  meal,  and  was 
ready  to  retreat  back  up  the  structure  to  other  categories  of 
choices.  Conceivably,  the  user  would  have  been  interested  in  an 
alternative  in  the  form  of  some  other  cold  food,  say  cold  cuts 
instead  of  salad,  so  that  a  superordinate  of  CuLU  PbOD  would  be 
appropriate  in  the  structure.  Alternatively,  it  might  have  been 
that  the  user  wanted  some  other  vegetable  dish,  so  that  a 
VEGETABLE  node  would  have  been  the  most  useful  superordinate.  Or 
perhaps  the  user  wanted  a  different  early  course  for  the  meal,  say 
soup  instead  of  salad.  Thus,  any  of  several  superordinates  (COLD 
FOODS,  VEGETABLE  DISHES.  EARLY  COURSE  DISHES)  might  have  been  what 
the  user  wanted.  Why  not  give  the  user  exactly  such  a  choice,  in 
an  Up  menu  from  the  salad  node,  in  addition  to  the  typical  Down 
menu?  If  the  user's  head  prominently  figures  a  certain  form  of 
representation,  externalize  it  in  the  organization  of  the  data, 
and  take  advantage  of  it  in  the  access  mechanism. 

V/e  are  in  the  midst  of  exploring  the  concept  of  up/down  menu  (KUD) 
systems  on  a  small  artificial  data  base  of  a  few  hundred  target 
items.  There  are  a  number  of  implementation  choices  that  require 
research,  moat  notably  regarding  how  to  construct  the  MUD 
structures:  In  using  normative- categorization  data,  various 
verification  and  "garbage  collection"  ideas  must  be  invoked  to 
ensure  that  links  exist  everywhere  they  are  appropriate,  and 
nowhere  else.  We  currently  ask  subjects  to  construct  "isa" 
networks  by  repeatedly  nominating  successive  superordinates  from 
each  node,  and  then  use  frequency  thresholds  on  nodes  and  links 
produced  across  subjects. 

When  other  subjects  are  then  allowed  to  use  the  MUDs.  several  more 
profound  issues  arise.  A  necessary  result  of  having  multiple  Up 
choices  is  that  Down  choices  are  not  always  partitions  of  the 
conceptual  class  encompassed  by  a  node.  The  consequence  that  that 
some  choices  overlap  is  of  mixed  advantage.  Under  some 
circumstances  it  allows  subjects  the  benefit  of  approaching  a 
target  with  different  interests  in  mind  or  with  a  different 
psychological  "set."  but  it  can  also  mean  that  subjects  must  not 
only  decide  whether  a  given  choice  will  lead  to  their  target,  but 
weigh  the  relative  merits  when  several  reasonable  choices  exist. 
Another  issue  is  that  KUD  structures  lack  the  systematic  traversal 
algorithms  that  trees  have.  Thus  it  is  more  difficult  to  be 
exhaustive,  i.e.  to  make  sure  all  nodes  have  been  seen  at  least 
once,  and  efficient,  i.e.  to  avoid  unnecessary  repetitive 
viewing  of  nodes-  Circumstances  exist  where  these  considerations 
might  be  important.  A  thi.'d  issue  is  that  the  class  of  targets 
actually  subsumed  by  any  downward  choice  is  constant,  while  the 
users  interpretation  of  the  choice  can  be  effected  by  the  history 
of  superordinates  just  passed  through.  In  a  tree,  there  is  only 
one  possible  ancestral  history,  so  no  ambiguity  arises,  but  not  so 
in  a  MUD  structure,  so  users  can  interpret  a  choice  variably,  due 
to  the  different  emphases  of  different  superordinates. 
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Some  issues  also  arise  in  working  with  MUDs  that  are  perhaps  even 
more  relevant  to  tree  structures.  Transitivity  of  class  inclusion 
is  critical  to  any  system  based  on  conceptual  hierarchy.  High 
level  choices  require  inferring  the  targets  subsumed  under 
intermediate  level  nodes.  Intransitivity  can  foil  this;  Suppose 
one  is  looking,  in  a  lay  person's  botanical  guide,  for  Scrub  Oaks 
which  are  classified  under  OAKS,  and  that  OAKS  are  in  turn 
classified  as  TREES.  The  problem  is  that  Scrub  Oaks  are  not 
popularly  considered  trees  (rather,  say-  shrubs).  This  lack  of 
transitivity,  due  perhaps  to  fuzzy  classification  systems,  would 
lead  one  away  from  a  correct  choice  of  TREES  in  the  pursuit  of 
Scrub  Oaks.  MUD  structures  have  an  advantage  over  menu  trees  since 
they  can  allow  other  routes  to  Scrub  Oaks  that  are  perhaps  free 
from  intransitivities. 

While  this  work  represents  only  one  modest  example  of  exploration 
of  more  diverse  p^chologically  motivated  structures,  we  believe 
that  efforts  like  it  can  lead,  to  systems  of  greater  help  to  human 
users. 
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The  Nature  of  User-Generated 
Commands  for  Interacting  with  a  Computer 


Mark  D.  Jackson 
Judith  E.  Tschirgi 

We  describe  the  results  of  an  experiment  investigating 
user  conceptions  of  a  natural  language  for  interacting  with 
a  computer  information  system.  Novice  and  experienced 
computer  users  performed  text  editing  and  information 
retrieval  tasks  using  a  simulated  interactive  system.  For 
each  task,  a  script  or  sequence  of  actions  was  presented 
to  the  user.  At  each  step,  users  read  a  description  of  an 
action,  such  as  correcting  errors  in  text  or  selecting  a 
page  of  information  to  view,  and  typed  a  command  that  they 
thought  was  a  natural  request  for  the  action.  If  their 
command  was  inappropriate,  they  were  asked  to  reword  their 
attempt;  otherwise  there  were  no  constraints  on  their  input. 

A  diverse  set  of  command  terms  was  generated  by  both 
groups  of  users,  with  few  actions  eliciting  common  terms 
from  within  or  across  groups.  Novices  used  more  English- 
like  command  formats  whereas  the  experts  followed  computer 
dialogue  conventions  learned  from  experience.  When  correcting 
commands,  novices  were  more  likely  to  use  strategies  applicable 
to  normal  conversation.  In  general,  the  responses  generated 
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by  our  users  under  instructions  to  produce  "natural"  commands 
dei^ended  critically  on  the  user’s  experience. 


We  have  shown  that  computei-  experiencr.  cifi’ects  users’ 
spontaneous  approaches  to  inter actinp,  vn.th  •;  compurer. 

Our  results  suggest  tnat  no  single  coiinriand  synlax  or  set 
of  lexical  items  will  be  consistent  with  tin*  expe ccacior.s 
of  all  users  \;itliout  additional  training.  Thus,  future 
research  must  determine  the  characteristics  of  easily 
learned  person- computer  dialogues  that  are  "•na;.ural"  across 
all  levels  of  experience. 
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A  Conceptual  Approach  to  Natural  Language  Fact  Retrieval 

Janet  L.  Kolodner 

School  of  Information  and  Computer  Science 
Georgia  Institute  of  Technology 
Atlanta,  GA  30332 


1 .  The  problems 


person:  What's  been  going  on  in  the  world  recently? 
computer:  The  last  hostage  was  finally  released  from  Iran, 
person:  I  thought  the  hostages  were  released  all  at  once, 
computer:  She  wasn't  really  one  of  the  hostages,  but  was 
arrested  later  when  she  traveled  to  Iran  as  a 
journalist  to  cover  the  Iranian  revolution. 
She's  been  referred  to  as  the  53rd  hostage, 
person:  Why  was  she  arrested? 
computer:  Trumped-up  espionage  charges. 


Suppose  we  wanted  to  build  an  intelligent  fact  retrieval  system 
such  as  the  one  above.  What  would  that  require?  It  would  have  to  be 
able  to  deal  intelligently  with  a  human  user,  giving  answers  containing 
not  only  the  appropriate  information,  but  also  the  right  amount  of 
information.  It  would  have  to  be  able  to  analyze  the  intent  of  a  human 
question  or  response,  figuring  out  what  the  questioner  really  wanted  to 
know.  The  system  would  also  have  to  be  able  to  search  its  memory  in  a 
smart  way,  so  that  as  the  memory  grew,  it  would  still  respond  in  a 
r'’.asonable  amount  of  time. 


There  are  three  major  problem  areas  to  be  addressed  in  designing 
such  a  system: 


1.  Interfacing  with  the  user:  analyzing  his  natural  language 
questions,  and  deriving  search  keys  from  them 

2.  Memory  search 

3.  Memory  organization  and  maintenance 
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These  problems  cannot  be  solved  independently  of  each  other.  The 
organization  of  memory  constrains  the  types  of  retrieval  and  updating 
processes  the  memory  can  have.  On  the  other  hand,  memory  organization, 
and  therefore  procedures  for  adding  information  to  memory,  must  be 
designed  based  on  retrieval  requirements.  Similarly,  memory's  organiza¬ 
tion  and  content,  and  the  relationship  between  items  and  categories  in 
memory  should  be  taken  into  account  in  interpreting  the  intent  of  user 
questions. 

The  CYRUS  system  has  dealt  with  aspects  of  all  three  of  these 
problems.  CYRUS  has  a  long  term  memory  which  was  designed  to  store 
information  about  important  political  dignitaries.  It  has  been  used  to 
store  and  retrieve  information  about  former  Secretaries  of  State  Cyrus 
Vance  and  Edmund  Muskie.  CYRUS  automatically  adds  new  information  to 
its  memory,  maintaining  good  memory  organization  in  the  process.  It  can 
be  queried  in  English,  and  uses  retrieval  strategies  and  knowledge  about 
the  organization  of  its  memory  to  search  for  answers.  A  successor  to 
CYRUS,  TED,  will  keep  track  of  events  in  the  life  of  Ted  Turner,  a 
celebrity,  sports  figure,  businessman,  and  broadcasting  figure. 

The  remainder  of  this  paper  will  outline  some  of  the  problems 
involved  in  designing  a  fact  retrieval  system  which  will  communicate 
effectively  with  people.  Interactions  between  the  interface,  memory 
search,  and  memory  organization  will  be  described.  It  will  also  outline 
the  solutions  to  these  problems,  as  implemented  in  CYRUS  and  described 
in  Kolodner  (1980). 

In  considering  these  problems,  we  will  assume  a  memory  organized  by 
conceptual  categories,  with  events  indexed  and  sub-indexed  in  those 
categories  by  their  salient  features.  Thus,  memory  processes  will 
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manipulate  conceptual  information,  or  the  meaning  of  the  data  in  the 
memory,  and  will  not  be  concerned  with  the  words  used  to  express  those 
concepts . 

2 .  Retrieval  requirements 

2.1  Choosing  a  category  for  search 

Searching  a  memory  organized  in  categories  requires  specification 
of  a  category  or  categories  to  be  searched.  Consider,  for  example,  the 
following  question: 

(Q1):  Nr.  Vance,  when  was  the  last  time  you  saw  an  oil  field 
in  the  Middle  East? 

If  "seeing  oil  fields"  were  one  of  memory’s  categories,  then  this 
question  would  be  fairly  easy  to  answer.  "Seeing  oil  fields"  would  be 
selected  for  search.  If  it  indexed  an  episode  in  the  Middle  East,  that 
episode  could  be  retrieved  from  it.  Similarly,  if  "seeing  objects"  were 
a  memory  category,  it  could  be  selected  for  retrieval  and  events  in  the 
Middle  East  and  events  at  oil  fields  could  be  retrieved. 

If  neither  of  these  categories  existed,  however,  a  category  for 
search  would  have  to  be  chosen.  We  can  imagine  the  following  reasoning 
process  being  used  to  do  that: 

A1:  An  oil  field  is  a  large  sight,  perhaps  I  saw  an  oil  field 
during  a  sightseeing  episode  in  the  Middle  East. 

Using  information  about  episodic  contexts  associated  with  "large 
sights",  a  "sightseeing"  category  can  be  chosen  for  retrieval.  Its 
contents  can  be  searched  for  an  episode  at  oil  fields  in  the  Middle 
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East,  If  the  sightseeing  category  organized  its  episodes  according  to 
the  type  of  sight  and  its  part  of  the  world,  and  if  there  had  been  an 
episode  in  the  Middle  East  at  an  oil  field,  then  ”a  sightseeing  episode 
at  an  oil  field  in  the  Middle  East”  could  be  retrieved. 

The  problem  of  choosing  a  category  for  search  is  both  an  interface 
problem  and  a  search  problem.  Search  requires  specification  of  a 
category  to  be  searched.  For  a  very  complex  data  base,  however,  we  can¬ 
not  expect  a  user  to  know  all  of  memory's  categories.  Nor  can  we  expect 
that  every  natural  language  question  asked  of  a  data  base  will  specify  a 
category  for  search. 

In  CYRUS,  this  problem  is  solved  by  associating  with  each  concept 
in  memory  the  categories  it  is  related  to.  Thus,  the  concept  "large 
sights"  has  "sightseeing"  associated  with  it,  while  "international 
contract"  has  the  category  "political  meetings"  associated  with  it.  In 
the  first  step  of  the  retrieval  process,  the  conceptual  representation 
of  the  question  (produced  by  a  conceptual  analyzer)  is  checked  to  see  if 
it  already  specifies  a  category  for  search.  If  not,  contexts  are  chosen 
from  among  the  categories  associated  with  each  of  the  question  com¬ 
ponents. 

2 . 2  Non-enumeration 

One  of  the  most  important  problems  to  address  in  designing  an 
interactive  retrieval  system  is  the  following: 

Retrieval  should  not  have  to  slow  down  as  memory  grows. 

This  requirement  constrains  both  the  retrieval  processes  and  the  memory 
organization.  In  terms  of  the  retrieval  processes,  it  requires  the  fol¬ 
lowing: 
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Retrieval  from  a  category  must  be  able  to  happen  without 

enumeration  of  the  category. 

In  fact,  this  interface  problem  depends  on  both  memory  organization  and 
retrieval  processes  for  a  solution.  If  categories  cannot  be  enumerated, 
then  there  must  be  some  other  way  of  searching  a  category.  This  can  be 
done  by  indexing  items  intelligently  in  categories,  and  then  by  specify¬ 
ing  and  following  appropriate  indices  during  retrieval. 

This  method  of  retrieval  brings  up  special  problems.  Retrieval  is 
easy  if  a  question  specifies  features  vrtiich  are  indexed.  This  is  not 
always  the  case,  however.  Two  solutions  to  this  problem  have  been 
implemented  in  CYRUS  —  automatic  generation  of  plausible  indices,  and 
search  for  alternate  contexts. 

2.2.1  Index  fitting  and  generation  of  plausible  features 

Just  as  we  cannot  expect  a  user  to  know  all  of  memory's  categories 
or  to  specify  a  category  in  his  question,  we  cannot  expect  him  to  know 
memory's  indexing  scheme.  Thus,  features  specified  in  a  question  might 
not  correspond  to  features  indexed  in  memory.  In  that  case,  given 
features  must  be  transformed  into  indexed  features. 

Inferring  Indexed  features  is  a  way  of  directing  search  within  a 
memory  category  without  enumerating  the  category.  Generated  features 
can  be  followed  to  find  the  target  item  in  the  category.  In  addition, 
there  must  be  a  way  of  recognizing  that  two  different  descriptions  refer 
to  the  same  item.  One  way  to  do  that  is  by  transforming  one  description 
into  the  second  one. 

Continuing  with  the  example  above,  suppose  sightseeing  episodes 
were  not  organized  in  a  category  according  to  the  type  of  sight  or  by 
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their  place  in  the  world.  In  that  case,  the  following  elaboration  of 


the  initial  retrieval  specification  might  be  appropriate  to  answer  the 
question; 

A2:  Vfhich  countries  in  the  Middle  East  have  oil  fields?  Iran 
and  Iraq  have  oil  fields,  and  Saudi  Arabia  does. 

If  sightseeing  episodes  are  organized  according  to  the  country  they 
took  place  in,  then  elaborating  on  "the  Middle  East"  and  specifying 
particular  countries  in  the  Middle  East  would  enable  retrieval  of 
episodes  that  took  place  in  each  of  those  places.  Instead  of  searching 
for  "sightseeing  at  an  oil  field  in  the  Middle  East",  search  for  each  of 
the  more  specific  episodes  "sightseeing  at  an  oil  field  in  Iran",  "sigh¬ 
tseeing  at  an  oil  field  in  Iraq" ,  etc .  could  be  attempted . 

The  process  of  transforming  given  features  into  indexed  ones  is 
called  index  fitting.  Index  fitting  is  done  in  CYRUS  by  component- 
instantiation  rules.  These  rules  use  information  about  components  in 
context  to  infer  additional  features  of  a  specified  item.  The 
nationality  of  participants  in  a  political  meeting,  for  example,  is 
known  to  correspond  to  the  sides  of  the  contract  being  discussed  at  the 
meeting.  Given  the  participants  in  a  meeting,  that  information  can  be 
used  to  infer  aspects  of  the  meeting  topic.  Component  instantiation 
rules  generate  plausible  features  for  a  targetted  item.  These  features 
correspond  to  indices  which  should  be  traversed  to  retrieve  that  item 
from  memory. 

2.2.2  Alternate  context  search 

Elaboration  of  plausible  features  is  only  one  way  of  directing 
search,  and  it  is  not  always  successful.  Suppose,  for  example,  that 


39 


there  was  not  enough  information  to  narrow  a  search  key  to  an  easily 
enumerable  (i.e.,  small)  part  of  the  data  base.  In  a  memory  where 
records  refer  to  other  contextually  related  records,  it  might  instead  be 
appropriate  to  search  memory  for  an  alternate,  more  retrievable  context. 
In  other  words,  retrieval  can  proceed  by  searching  for  a  related  context 
which  (1)  might  be  more  retrievable  than  the  target  item,  and  (2)  might 
refer  to  the  item  targetted  for  retrieval. 

Since  CYRUS*  memory  is  organized  in  event  categories,  alternate 
context  search  in  CYRUS  corresponds  to  search  for  an  episode  related  to 
the  targetted  event.  Since  sightseeing  in  the  Middle  East  would  have 
had  to  happen  during  a  trip  to  the  Middle  East,  retrieving  a  trip  to  the 
Middle  East  could  aid  retrieval  of  an  appropriate  sightseeing 
experience.  Thus,  the  following  reasoning  would  also  be  appropriate  to 
answer  the  question  above. 


A3:  In  order  to  go  sightseeing  in  the  Middle  East,  I  would 
have  had  to  have  been  on  a  trip  there.  On  a  vacation 
trip,  I  wouldn't  go  to  see  oil  fields,  so  I  must  have  been 
taken  to  oil  fields  during  a  diplomatic  trip  to  the  Middle 
East.  Which  countries  might  have  taken  me  to  see  their 
oil  fields?  Saudi  Arabia  has  the  largest  fields,  perhaps 
they  took  me  to  see  them.  Yes,  they  did  when  I  was  there 
last  year. 


Why  does  it  seem  reasonable  to  search  for  "trips”  when  a  "sigh¬ 
tseeing"  episode  should  be  retrieved?  How  can  search  for  alternate 
events  be  constrained?  Only  alternate  contexts  that  might  be  related  to 
an  event  targeted  for  retrieval  should  be  searched  for. 

In  general,  for  search  to  be  constrained  to  relevant  contexts, 
memory  categories  must  hold  generalized  information  concerning  the 
relationships  of  their  items  to  items  in  other  memory  categories.  In 
CYRUS,  alternate  context  search  is  facilitated  by  three  things: 
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1 .  knowledge  of  the  usual  relationships  between  event 
categories 

2.  a  set  of  context  construction  rules  for  constructing  a  new 
context  based  on  that  knowledge 

3.  a  set  of  search  strategies  for  directing  search  for  the 
target  event  within  the  context  of  the  alternate  event 

Thus,  CYRUS  knows  about  the  usual  relationship  between  sightseeing  and 
trips,  how  to  construct  a  trip  context  based  on  a  sightseeing  context, 
and  how  to  search  the  sequence  of  events  of  the  trip  to  find  a  sigh¬ 
tseeing  experience  once  an  appropriate  trip  is  found. 

2.3  Maintaining  a  conversational  context 

Maintenace  of  a  conversational  context  is  necessary  for  resolution 
of  ambiguous  references,  anaphora,  and  pronominal  reference.  Suppose, 
the  question  above  were  followed  in  conversation  by  the  following  one: 

(Q2):  Did  you  talk  to  the  workers  there? 

In  order  to  understand  what  "there"  means,  the  answer  to  the  previous 
question  must  be  consulted.  In  order  to  understand  which  workers  are 
being  talked  about,  the  context  of  "visiting  oilfields",  plus  knowledge 
about  oilfields  themselves  must  be  used. 

Maintenance  of  a  conversational  context  can  also  constrain  memory 
search.  Often,  it  is  necessary  to  search  only  the  context  of  the  answer 
to  the  previous  question  to  find  an  answer  to  the  current  one.  In  the 
example  above,  for  example,  only  the  events  Involved  in  Vance’s  visit  to 
the  oilfield  in  Saudi  Arabia  need  be  searched  for  an  answer.  If  the 
previous  context  is  maintained,  it  can  constrain  search  to  that  episode 
only,  so  that  all  of  memory  does  not  have  to  be  searched. 


41 


2.4  Summary  of  retrieval 


The  retrieval  process  described  can  be  seen  as  a  process  of 
reconstructing  what  might  be  true,  and  checking  memory  to  make  sure  it 
indeed  was.  To  retrieve  an  episode  of  "seeing  oilfields",  a  hypothesis 
was  made  about  the  type  of  event  it  might  have  been  (sightseeing),  where 
it  might  have  happened  (Iran,  Iraq,  Saudi  Arabia,  etc.),  and  what  else 
might  have  been  going  on  at  the  time  (a  trip). 

Judging  from  this  example,  the  process  of  retrieval  requires  at 
least  the  following  processes: 

1 .  selection  of  a  category  for  search 

2.  search  within  the  category  for  the  targeted  event 

3.  elaboration  on  the  specification  of  the  event  to  be 
retrieved 

4.  search  for  episodes  related  to  the  target  event 

3.  Requirements  on  the  memory  organization 

The  ability  of  memory  to  support  retrieval  without  enumeration  is 
also  dependent  on  the  memory  organization.  The  traditional  solution 
within  computer  science  to  the  non-enumeration  problem  is  to  index  items 
within  categories.  An  event  should  be  indexed  in  a  category  by  those  of 
its  features  that  are  salient  to  the  category.  In  that  way,  specifica¬ 
tion  of  an  indexed  feature  will  enable  retrieval  of  items  with  that 
feature  without  enumerating  the  whole  category. 

If  memory  categories  are  heavily  indexed  by  salient  features, 
retrieval  processes  will  have  a  large  selection  of  features  to  specify, 
any  of  which  might  specify  a  target  event.  The  retrieval  process  will 
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be  made  easier  since  the  easiest  elaborations  can  be  attempted  first. 

The  richer  the  indexing,  however,  the  more  space  is  needed  for 
storage .  Indexing  must  be  controlled  so  that  memory  does  not  grow 
exponentially.  In  CYRUS,  similarities  between  events  are  used  to 
control  indexing.  Memory  keeps  track  of  the  similarities  between  events 
within  a  category,  and  limits  indexing  to  the  differences  between 
events.  Thus,  if  almost  all  the  events  in  a  "diplomatic  meetings" 
category  are  with  foreign  diplomats,  indexing  them  according  to  the 
occupations  of  their  participants  would  be  redundant  and  therefore 
unnecessary.  It  would  not  divide  the  category  into  significantly  smal¬ 
ler  parts.  If,  however,  one  of  those  meetings  were  with  someone  other 
than  a  foreign  diplomat,  indexing  the  meeting  by  that  feature  would 
differentiate  it  from  other  events  in  the  category.  In  fact,  the 
similarities  which  constrain  indexing  correspond  to  the  generalized 
information  necessary  for  retrieval. 

Finally,  a  memory  for  events  should  maintain  Itself.  This  means 
that  the  process  of  selecting  indices  should  be  automated.  It  also 
means  that  events  must  be  sub-indexed  within  the  sub-categories  that  are 
formed  when  multiple  events  are  indexed  in  the  same  way.  Otherwise,  the 
sub-categories  would  have  to  be  enumerated.  This  places  another 
requirement  on  the  updating  processes.  In  order  to  constrain  later 
indexing,  and  in  order  to  guide  the  retrieval  strategies,  the  automatic 
updating  process  must  also  keep  track  of  the  similarities  between  events 
in  each  newly-created  sub-category.  If  we  don’t  want  retrieval  to  slow 
down  as  new  events  are  added  to  memory,  then  memory  must  be  able  to 
maintain  its  organization,  creating  new  conceptual  categories  when 
necessary  and  building  up  required  generalized  information.  CYRUS  does 
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this  through  a  series  of  organizational  strategies. 


Another  aspect  of  maintaining  memory’s  organization  involves 
monitoring  memory  search.  More  frequently  requested  information  should 
be  more  accessible  than  less  frequently  requested  information,  and  more 
recently  accessed  information  should  be  more  accessible  than  less 
recently  accessed  information.  This  involves  both  reorganization  of 
memory  taking  frequency  of  access  into  account  and  restructuring  the 
organizational  strategies  themselves,  so  that  more  frequently  asked  for 
types  of  information  will  automatically  be  organized  for  accessibility 
as  they  are  added  to  the  data  base.  This,  and  other  memory  maintenance 
problems  which  have  not  been  described  here,  are  being  addressed  in 
current  and  future  research. 
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Psychological  Investigations  of 
Natural  Command  and  Query  Terminology 

Thomas  K .  Landauer 
Susan  T.  Dumais 

Computer-user  Psychology  Research  Group 
Bell  Laboratories,  Murray  Hill,  NJ 


It  is  frequently  asserted  that  unsophisticated  users  v;ould 
find  computer  systems  more  congenial  if  communications  with 
them  were  to  employ  more  "natural”  words.  In  a  series  of 
empirical  studies,  we  have  (1)  developed  a  method  for  iden¬ 
tifying  natural  command  words  for  a  particular  task,  (2) 
tested  the  value  of  the  resulting  natural  command  lexicon 
in  the  initial  stages  of  transfer  from  manual  to  automated 
task  performance,  and  (3)  induced  people  to  form  "natural" 
data  queries  and  analyzed  the  language  they  used. 

Identification  of  "natural"  command  terms.  Twenty- two  stu¬ 
dents  in  secretarial  schools  and  twenty-six  high  school 
students  with  typing  skills  were  given  manuscripts  with 
author's  marks.  The  author's  marks  indicated  a  variety  of 
desired  corrections  corresponding  systematically  to  the 
kinds  of  changes  that  are  accomplished  in  manual  or  compu¬ 
ter  text-editing  operations.  The  students  were  asked  to 
write  instructions  to  another  typist,  who  did  not  have  the 
author's  marks,  specifying  what  was  to  be  done  to  the 
manuscript.  This  method  produced  verbal  descriptions  of 
actual  editing  operations  (e.g.  "take  out  the  word  the " ) 
as  contrasted  to  description  of  the  author's  marks  (e.g. 
"crossout")  or  goal  (e.g.  "fix  the  spelling").  Among 
noteworthy  resulting  observations  were  the  following: 

(1)  There  was  little  agreement  on  word  use;  e.g.  the  three 
most  frequent  operational  verbs  used  accounted  for  no  more 
than  33%  of  descriptions  of  any  one  correction,  (2)  The  words 
used  were  not  like  those  commonly  employed  by  computerized 
editing  systems,  e.g.  the  verb  "delete"  was  never  used,  and 
(3)  Unlike  many  computerized  text-editing  systems,  students 
and  secretaries  tended  to  use  different  words  to  describe 
operations  on  characters  and  blanks,  but  the  same  words  to 
describe  similar  operations  on  whole  lines  and  line-internal 
strings  (e.g.  "change  'string  a  or  line  a'  to  'string  b  or 
line  b'"). 

Testing  the  value  of  natural  command  terms  for  initial  learning. 
We  devised  a  set  of  miniature  text-editing  systems,  each  con¬ 
sisting  of  only  append,  delete,  and  substitute  operations  plus 
start  and  stop  commands.  For  one  version,  bhe  verbs  used  in 
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the  operation  coitimands  were  "append",  "delete"  and  "substitute", 
terms  often  used  in  computer  text-editors.  For  another,  they 
were  the  verbs  most  frequently  used  by  secretaries  and  typists 
to  describe  the  required  action,  "add",  "omit",  and  "change", 
respectively.  A  third  variant  used  randomly  chosen  English 
verbs,  "cipher",  "allege",  and  "deliberate"  as  a  baseline 
control  for  lexical  naturalness.  In  addition,  the  text- 
editors  varied  (a)  with  respect  to  whether  the  command  verb 
was  to  be  spelled  out  or  abbreviated  to  its  first  letter, 
and  (b)  with  respect  to  whether  the  same  command  word  applied 
to  both  line-internal  strings  and  whole  lines  (e.g.  "omit  /a/" 
for  within  -  and  "omit"  for  whole-line)  or  used  different 
command  words  (e.g.  "change  /a//"  for  within-line  and  "omit" 
for  whole-line) .  Forty-eight  secretarial  and  typing  students 
each  spent  about  two  hours  studying  an  introductory  self- 
instructing  manual  and  simultaneously  doing  a  series  of  on-line 
learning  and  test  exercises.  The  manuals  varied  only  in  neces¬ 
sary  ways  (essentially  only  in  command  names)  and  as  little 
extra  help  as  possible  was  provided. 

The  main  results  of  interest  were  as  follows;  (1)  The  time 
to  perform  test  exercises  was  not  significantly  influenced  by 
command  name  variations;  subjects  performed  as  well  when  they 
were  learning  to  "allege",  "cipher",  and  "deliberate"  as  when 
they  were  learning  to  "add",  "omit"  and  "change".  However,  a 
post-session  questionnaire  revealed  some  subjective  preference 
for  the  more  familiar  terms.  It  is  also  important  to  note 
that  the  subjects  were  learning  a  very  simple  system  with  very 
few  terms,  and  that  they  were  not  required  to  remember  the 
terms  over  substantial  periods.  It  is  possible  that  "natural" 
terms  would  be  advantageous  in  larger  lexicons  or  when  long- 
range  recall  was  necessary.  However,  natural  words  do  not 
appear  to  provide  substantial  benefit  during  the  highly  cri¬ 
tical  first  few  hours  of  introduction  to  the  new  and  exotic 
computer  aided  text-editing  environment,  as  one  might  have  ex¬ 
pected  and/or  hoped.  (2)  Abbreviated  command  names  were 
slightly  moro  time-consuming  to  use  at  first,  but  became  sig¬ 
nificantly  less  so  after  some  practice.  (3)  In  this  case,  at 
least,  the  use  of  different  commana  names  for  whole-line  and 
within-line  operations  resulted  in  better  performance  than 
using  the  same  name  for  both.  This  is  contrary  to  subjects' 
usage  in  spontaneous  descriptions.  We  hypothesize  that  the 
requirement  to  use  different  syntactic  constructions  in  our 
editors  was  responsible;  that  differing  command  words  make  it 
easier  to  learn  and  use  differing  constructions  even  if  the 
operations  are  naturally  thought  of  as  similar. 

Characteristics  of  natural  data  specifications.  Three  hundred 
and  thirty-seven  college  students  tried  to  specify  verbal 
objects.  They  were  given  a  list  of  items  like  "newsweek", 
"Empire  State  Building",  etc.  and  asked  to  try  to  specify  each 
so  that  another  student  or  (in  other  cases)  a  computer  would 
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respond  with  the  provided  word.  There  were  no  restrictions 
as  to  the  form  or  content  of  the  descriptions  (except,  of 
course,  that  they  could  not  contain  the  target  item) . 

Among  interesting  characteristics  of  the  response  were  these: 
(1)  Students  rarely  used  boolean  expressions  more  complicated 
than  simple  conjunction.  (2)  Specification  by  exclusion 
(e.g.  "a  popular  weekly  newsmagazine  other  than  Time ” )  was 
very  infrequent  despite  the  intentional  inclusion  of  items 
that  easily  admitted  of  such  specification.  (3)  The  most 
common  specification  techniques  were  simple  lists  of  positive 
attributes  or  a  single  immediate  superordinate,  followed  by  a 
list  of  attributes  (e.g.  "a  tall  building  in  New  York  located 
on  34th  Street  and  5th  Avenue").  (4)  Specifications  were 
often  very  vague  and  depended  heavily  on  presuppositions  about 
preferred  responses  of  the  target  person  or  system  (e.g.  "a 
tall  building  in  New  York",  a  specification  that .apparently 
assumes  that  one  member  of  a  large  class  will  be  known  to  be 
most  representative  or  most  dominant  and  will  be  given  in  the 
absence  of  further  specification) . 

We  have  no  evidence  as  yet  as  to  whether  systems  allowing 
"natural"  query  specifications  would  be  easier  to  use. 
However,  it  does  seem  apparent  that  the  use  of  more  precise 
expressions  cannot  be  expected  without  special,  perhaps  dif¬ 
ficult,  training. 
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ORGANIZING  MEMORY  FOR  USE  IN  UNDERSTAND I^X3 


by 

Miciiasi  LebowiLz  -•  C-iiuinbi:i  Uni'/or-iity 

I  Introduction 

Episodic  memory  plays  an  important  role  In  the  understanding  o£  natural 
language.  It  can  oe  used  to  piovide  context  for  top-down  processing,  to 
determine  the  segments  of  a  text  that  should  be  focused  upon, 
situation-dependent  d'-fajJts,  and  so  forth.  While  this  sh.ould  come  as  no 
great  surprise,  it  is  the  case  that  most  of  the  work  relating  meitior'/  (in  the 
form  of  databases)  and  language  understanding  has  emphasized  the  utility  of 
natural  language  front-ends  for  databese  query  (  ’’liarris  78,  Kaplan  77,  Woods 
and  Kaplan  72] ,  for  example) ,  rather  than  the  ways  that  memory  can  be  used  in 
language  processing.  Furthermore,  what  wrk  there  has  been  on  using  memory 
for  language  processing  has  been  m  the  form  of  question  answering,  ignoring 
entirely  the  crucial  issue  of  using  existing  knowledge  in  memory  ta  h.elp 
acquire  more  information.  Yne  us?  of  memor/  in  the  process  of  reading  text 
for  the  purpose  of  updating  memoty  -  and  the  effect  tnis  has  on  'n  iroi:'/ 
organization  -  is  extremely  important,  and  is  the  issue  I  will  address  nore. 

In  the  course  of  this  brief  presentation  I  will  be  using  examples  from  a 
computer  model  that  is  concerned  with  the  relation  between  language  and 
memory.  IPP  (the  Integrated  Partial  Parser),  written  at  Yale,  is  able  to  read 
nev/s  stories  about  terrorism  and  record  them  in  a  coherent  memory.  It  makes 
generalizations  th^t  help  organize  th:  memories  of  the  events  described  and 
are  used  to  assist  in  later  processing.  IPP  t.-  fully  describe<!  in  [Lebowitz 
80).  A  second  prograni,  RFSEARCHEF; ,  is  in  ^rv'  -  'rly  stages  of  Juvclornent.  ’t 


will  be  based  upan  IPP,  but  will  include  <»  r,'.c‘i:iory  ol  a  S''J'.“ril:iE]c  domain, 
built  up  by  readiny  technical  abstracts.  ..o  tun  comp.'oyicy  oi  »’he 

material  that  RSSEARCHER  will  be  reaiiny.  •.!!';  ice  o!  n>cmof/  In  r;he 
understanding  process  will  be  extreii.ei.y  important. 

The  point  that  I  want  to  stress  her^  is  that  tlic  neoci  for  applying 
information  from  me:nory  during  understanding  (knowlod^'e  acquisition)  must  he 
considered  while  attempting  to  determi.ne  an  appropriate  memory  organisation. 
In  the  space  avaiJable  here  I  will  give  several  examples  illustrating  the  need 
for  the  application  of  episodic  memory  to  understanding,  and  then  outline  an 
appropriate  memory  organisation  that  keeps  this  use  in  mind. 

2  Why  we  need  to  use  memory  in  understanding 

The  following  story  is  rather  typical  of  those  read  by  IPP. 

Figure  1:  Attack  on  kibbutz 
SI  -  UPI,  7  April  RP,  Israel 

Israeli  troops  today  stormed  a  children's  dormitory  in  a  kibbutz  on 
the  Lebanese  border  to  free  hostages  seized  nine  hours  earlier  by 
gun-blazing  Palestinian  guerrillas  and  killed  all  five  raiders. 

There  are  two  problems  in  understanding  story  Si  that  memory  car.  help 
overcome.  The  first  involves  the  meani.ng  of  the  word  "storne<J'',  which  in  this 
domain  can  refer  to  either  terrorists  attacking  a  building  or  government 
officials  counterattacking  a  group  of  terrorists.  A  similar  problem  arises 
with  "seized",  which  could  plausibly  refer  to  either  a  kidnapping  or  a 
building  takeover.  The  later  ambiguity  is  in  fact  never  resolved  :n  this 
text.  Each  of  these  problems  is  easi'y  overcome  by  accessing  the  proper 
information  from  memory,  generalizations  such  as  those  in  the  next  figure, 
made  after  reading  earlier  stories- 
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Figure  2:  General izatioas  about-  ('‘-xtori.i<'''i  in  Israel 
Israeli  troops  ,'c-rry  out  c’OunL£rattac‘''.s  against  terrorists. 

Palestinic.is  in  Israel  engage  in  extortior:  by  taking  places  ovot . 

Eioth  ambiguous  in  SI  can  ho  resolved  by  jssiur.lng  that  when  relevant 

generalizations  exi.'c,  •/'ords  should  be  JisanLiguat-o  so  that  the  nev/  ntety 
fits  the  existing  general izations.  The  first  generalization  allows  the 
disambiguation  cl  "storr^iod"  as  ic  is  reao,  iisi:;g  this  rule.  Similarly,  we 

assume  "seized”  indicates  a  takeover,  since  that  corresponds  to  the  second 

generalization,  had  U;o  general izatior.  stated  that  extortions  in  Israel  were 
usually  kidnappingc,  then  "seized"  would  have  been  assumed  to  refer  to  such  an 
event. 

Notice  that  we  cannot  expect  a  person  (or  computer  program)  to  be 

pre-supplied  with  all  the  generalizations  necessary  to  resolve  problems  of 
this  sort.  Instead,  these  observations  must  be  developed  by  reading  (or 

otherwise  learning  about)  specific  events  and  generalizing  from  them. 

The  following  story  also  requires  information  from  memory. 

Figure  3:  Basques  implicit  in  attack 
S2  -  New  York  Time:;,  24  August  79,  Spain 

Bombs  exploded  in  c.  French  ban’-,  and  a  French  i:.imlgrati.;r.  office  in 
north.ern  Spain  eat  iv  today,  causing  damage  but  no  injuries,  according 
to  police. 

This  story  does  .not  specify  the  identity  of  the  terrorists  who  set  off 
tiie  explosion  devcribeci.  Howe-'er,  most  {X;ople  with  some  knowledge  Spain 

are  aware  that  this  wss  probably  a  Basque  at^-ack.  Such  a  conclusion  comes 
from  a  previously  mcide  general  ization  about  terrorists  in  Spain. 
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The  next  figure  shows  how  IP!  h.ndlcs  story  F2  vhen  it  has  existing  in 


memory  a  generalization  tiiat  Basques  are  the  att<jckity  i'-,  bombings  in  Spain. 

Figure  4:  IPP  inferring  defauJt  f  1 1 'r  fe-a';.-;.  s 

Generalization  (3ASQ'JE-GKN)  already  in  menKuy: 

S-DESTRUCTI W-ATFACK  w  i  th : 


ACTOR 

(1) 

DEMAND-TYPE 

SEPAIx-ATl'iM 

NATIONALITY 

BASQUE 

METHODS 

(1) 

AU 

SEXPLODE-BOflB 

LOCATION 

(1) 

AREA 

WESTERN- EUROPE 

NATION 

SPAIN 

RESULTS 

ill 

AU 

CAUSE  -P/vMAGi; 

♦(PARSE  S2) 

Story:  S2  (8  24  79)  SPAIN 

(Ba4BS  EXPLODED  IN  A  FRE^JCH  BANK  AND  A  FRENCH 
IMMIGRATION  OFFICE  IN  NORTHERN  SPAIN  EARLY  TODAY 
CAUSING  DAMAGE  BUT  NO  INJURIES  ACCORDING  TO  POLICE) 


»>  Beginning  final  memory  incorporation  ... 


Feature  analysis: 
RESULTS 
METHODS 
LOCATION 


EV16  (S-DESTRUCTIVE-ATTACK) 
AU  CAUSE-DAMAGE 

AU  $EXF'LODE-BOMB 

AREA  WESTERN-EUROPE 

NATION  SPAIN 


Indexing  EV16  as  variant  of  BASQUE-GEN 

Inferring  feature  ACTOR  DEMAND-TYPE  SEPARATISM 
of  EV16 


</< - 


Inferring  feature  ACTOR  NATION  BASQUE 
of  EV16 

»>  Memory  incorporat  ion  complete 


In  this 
generalization 
generalization 


example. 

IPP 

leco.j.ii.  os  th.il 

S2  5  ?  an 

:•  ristance  of 

tnat  it 

has 

made  previou.sly 

(L.'./;;UL-GEN) 

.tad  ‘ijr 

to  supply  default  charfcte''isti'’.v  of  -he  terrorist;;.  in 
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particular,  IPP  assumes,  corresiX)nd  ir.cj  with  the  '^ener.'il  ion,  that  the 


terrorists  are  Basque  separatists.  The  deterni'nitjon  of  defauUs  of  :his 
is  a  major  use  of  generalizations.  IPP  also  rv'fxes  this  event  as  an  i:;st-'r.ce 
of  the  most  relevani"  cj'^neralizritior.,  so  !nnt  •-  •  •!.  ; -rri'^e  it  latc-t  t.e  mate 
further  generalizations.  I  will  sny  irorc  about  ini-?  last  point  beJow. 

3  Organizing  meinory  for  understanding 

Examples  such  as  31  and  pL:r:e  se,;ra:  ccnstraincs  tapt-u:  th»' 

organization  for  memory.  In  particular: 

1.  It  m  be  possible  to  access  generalizations  based  on  partial 

information  so  that  relevant  information  can  be  applied  during 
understanding,  and  not  just  after  it  has  been  completed. 

2.  Many  different  features  of  a  general izaticn  must  provide  access  t'^ 
that  generalization,  so  that  instances  with  different  relevant 
features  mentioned  explicitly  can  all  ne  identified. 

3.  Generalizations  must  lead  to  memories  of  actual  events  so  that 
further  generalization  can  occur. 


These  constrainzs  suggest  a  pcossible  memory  scheme.  This  scl.emc ,  as 
implemented  in  IPP,  has  several  tree-like  structures,  each  consisting  of  rr. jre 
and  more  specific  versions  of  genera!  izati.-r.s.  The  generalizations  if.  '.lic 
tree  are  used  to  orjani-se  actual  mcmai  ios  of  The  tress  are  '.‘.'■‘d 

with  high-level  kr.ov•'led:^e  suructuzes  tT.ar.  ■■•r.  :  =  c  f.-'  d*'..  /i:.  evfn.s  •  ‘i- 

domain  at  an  intentional  level.  iP’o.-  ie’'toristn  tb..se  i extatti'i.  lud 
attacks  on  individuais- . 


A  typical  i  reo  oi  g-rneralizat  ic.jic  in  IFP':-  moTory  mi.j‘ 1. 


I'k  .•iivr.'-'th'.i ! 


like  the  next  figure. 


A  tree  of  genet ai  izat ions  sucii  as 
between  each  genercii  izution  and  i»‘;  iii 


ihc  w.ie  .n  Figure  5  mu^ipJe  h\d':;<iriq 
!f  ;iMc  ver'-iont.  Normal.'y  each 


Figure  5:  An  IPP  Generalization  Tree 


S-EXTORT 

/  \ 

G1  -  kidnappings  of  G2  -  hijackings  of 
businessmen  German  planes 

I 

V 

G3  -  kidnappings  of  businessmen 
in  Italy  by  the  Red  Brigade 
I 

V 

the  kidnapping  of  a  shoe  manufacturer 
in  Milan  in  August 

novel  feature  of  a  generalization  is  used  as  an  index  for  that  node  in  memory. 
(Some  exceptions  for  coimion  features  are  mentioned  in  [Lebowitz  80].)  So  in 
Figure  5,  generalization  G1  could  potentially  be  accessed  once  a  story  has 
been  identified  as  an  extortion  that  is  a  kidnapping  or  an  extortion  with  the 
hostage  being  a  businessman.  This  kind  of  identification  is  exactly  what  we 
need  to  do  during  the  processing  of  a  story  so  that  the  remaining  information 
in  a  relevant  generalizations  can  be  used  to  help  processing  in  the  ways 
indicated  above. 

The  processing  scheme  that  uses  such  a  memory  involves  identifying  the 
most  specific  generalizations  relevant  to  a  story  as  it  is  read,  using  any 
features  accumulated  from  the  story  along  with  the  corresponding 
generalization  index  tree.  Then  the  remainder  of  the  story  can  be  interpreted 
in  terms  of  these  generalizations.  Further,  by  having  actual  events  stored 
under  the  generalizations,  by  the  time  we  have  finished  reading  a  story  we 
have  available  similar  events  that  might  be  suitable  for  additional 
generalization. 
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similar  schemes  for  organizing  memory  li.ive  il.>o  sliown  to  be  useful  in 
explaining  reminding  phenomena  (Schank  BO)  and  hunnr.  memory  retrieval 
(Kolodner  80] . 


4  Conclusion 

Clearly  the  moivoty  scheme  devise-.i  for  IFP  scmawhat  coo  simple.  For  .nore 
complex  types  of  data  (such  as  in  the  scientific  domain  that  will  be  dealt 
with  by  RESECiRCUFR) ,  memory  will  clearly  have  to  be  more  •■strongly 
interconnected,  resulting  in  a  structure  that  is  more  a  network  that  a  tree. 
However,  the  organization  used  for  IPP  indicates  bow  the  organization  of 
memory  must  be  appropriate  for  the  process  of  knowledge  acquisition,  and  not 
just  the  retrieval  of  information. 
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ABSTRACT 


In  the  coming  decade/  a  new  generation  of 
computer-based  systems  offers  the  potential  to  do  for  the 
human  mind  what  the  industrial  revolution  did  for  human 
muscle.  To  realize  this  potential/  uie  must  study 
sophisticated  kinds  of  software/  in  which  the  computer 
performs  tasks  previously  thought  to  require  human 
intelligence.  Me  must  also  study  how  to  organize  such 
hardware/sof twarc  systems  to  interact  most  eff.«ctively  with 
their  human  masters. 

TI's  Computer  Science  Laboratory  is  attempting  to 
construct  and  evaluate  experimental  prototypes  of  such 
systems.  Their  design  has  required  unique  combinations  of 
talent  from  diverse  disciplines.  Me  arc  combining  expertise 
from  two  fields  in  particular:  artificial  intelligence  and 
human  factors  engineering.  This  talk  will  illustrate 
synergistic  effects  of  cooperation  between  these  two  fields, 
examples  will  be  drawn  from  current  research  projects  in 
natural  language  processing  and  advanced  computer  based 
instruction. 


55 


TABLE  OF  CONTENTS 


•  1.®  introduction 

2.0  INTERACTIUE  NATURAL  LAN‘..’jmoL  «:.•>  U  rLf.:5 
Z.l  Dcscr  1^:1  ion  of  thi  rrobJcrr. 

2.2  What  Human  Fictors  i.o*^tr  i hutt* ■ 

2.3  What  ArtjficiaJ  Inte  1  )  1  genet’  Centf  t&ute s 
3.0  INTELLIGEN'*  TUTORING  SY=  :E’’:;i 

3.1  Descr  \  r  t  ’  cn  of  the  P.  •hlcm 

3.2  What  Humar  Factors  Contributes 

3.3  What  Artificial  Intelligence  Contributes 
4.0  CONCLUSION 

5.0  REFERENCES 


37 


1.0  INTRODUCTION 


People  mii\  ha«.*e  trouble  performing  «  physical  task  If 
the  demands  of  the  task  exceed  their  physical  capacities. 
To  many  of  us  nouiadays*  that  seems  like  simple  common  sense. 
Houieuer*  nt  was  not  until  the  late  I890's  that  Frederick  U. 
Taylor  made  his  pioneering  studies  of  how  how  to  design  jobs 
and  tools  so  that  they  more  closely  match  the  physical 
capacities  of  people.  (Ps  an  aside#  what  Taylor  studied  was 
shouels  and  how  best  to  use  them.) 

The  field  of  human  factors  engineering  had  its  birth 
during  Morld  Uar  II.  The  founders  o'  the  field  recognized 
that  errors  can  occur  in  man-machine  systems  when  the  man's 
Job  in  these  systems  ouerloads  his  mental  capacities. 
Before  going  any  further#  let's  first  examine  what  is  meant 
by  "man-machine  system."  In  a  man-machine  system#  one  or 
more  of  the  components  is  a  person#  and  the  person  must 
interact  with  the  machine  components.  The  designs#  goals 
and  complexity  of  these  systems  uary  considerably.  Figure  1 
shows  a  schematic  of  a  simple  man-machine  system. 


Show  Foil  Number  -l-  Here. 

(Nan-machine  system  cartoon  from  Chapanis#  196S) 

During  Uorld  Mar  II  it  was  found  that  many  errors  in 
human-machine  systems#  such  as  airplane  accidents  due  to 
"pilot  error#"  could  in  fact  be  traced  to  the  design  of  the 
controls  and  displays.  These  are  the  components  of  the 
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THE  WORK  ENVIRONMENT 


lysttm  through  uihich  the  human  and  machine  components 
exchange  information.  Researchers  such  as  Alphonse  Chapanis 
and  Paul  Fitts  discovered  that  certain  control  and  display 
designs  virtually  invited  even  experienced  people  to  misuse 
or  misinterpret  them.  The  solution  lay  in  redesigning  tr  . 
controls  and  displays  so  that  they  operate  in  manner  more 
compatible  with  the  mental  capacities  of  people. 

The  TI  Computer  Science  Laboratory  develops 
human-machine  systems  in  which  the  machine  is  a  digital 
computer  whose  software  is  intended  to  be  (more  or  less) 
“intelligent."  Efforts  to  create  such  artificially 
intelligent  systems  have  been  underway  for  only  a  few 
decades;  the  founders  of  the  field  (e.g./  McCarthy  C196S]> 
Minsky  C19653>  and  Newell  &  Simon  CISTS!)  are  still  active 
contributors.  In  even  this  short  time«  much  has  been 
accomplished.  There  are  systems  that  can  play  master-level 
chess*  solve  complex  integrals*  understand  and  obey  commands 
stated  in  simple  English*  speak  in  a  human-like  voice* 
recognize  objects  in  scenes*  solve  analogy  problems*  and  so 
on.  Central  themes*  such  as  the  notion  of  a  problem  space* 
means-ends  analysis*  and  heuristic  programming  have  emerged 
to  organize  thinking  in  the  field.  AZ  software  techniques 
such  as  semantic  network  knowledge  representat ions* 
augmented  transition  networks  and  chart  parsers*  and 
production  rule  deduction  systems  have  gained  wide 
acceptance  even  as  better  approaches  appear. 
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Tht  long  ttrm  goal  of  this  work  is  to  riouelop 
**inttUigtnt  interactive  systems"  tuhich  do  for  people's 
minds  what  the  industrial  revolution  did  for  their  muscles. 
Accomplishing  this  goal  requires  combining  the  skills  of 
human  factors  engineers  and  AI  specialists.  The  purpose  of 
this  talk  is  to  describe  the  benefits  of  a  synergistic 
relationship  between  these  two  fields.  1wo  research 
projects  currently  underway  at  TI  serve  to  illustrate  those 
benefits. 

E.0  INTERACTIVE  NATURAL  LANGUAGE  SYSTEMS 

2.1  Description  Of  The  Problem 

Chapanis  (1975)  has  demonstrated  that  interactive 
natural  language  dialog  is  remarkably  unruly/  with  many 
misspellings  and  grammatical  errors.  Although  progress  has 
been  made  in  getting  computers  to  process  pristine  English 
text/  it  will  be  many  years  before  computers  will  b»  able  to 
process  unlimited  interactive  natural  language  dialog. 

As  our  group  works  toward  a  system  that  interacts  in 
true  natural  language#  another  project  is  under  way  that  is 
oriented  toward  intermediate  results.  Tfie  goal  of  this 
project  is  to  define  a  human  engineered  subset  of  natural 
language.  This  subset  would  retain  all  of  the  user-oriented 
benefits  of  unrestricted  natural  language  dialog.  However# 
its  use  would  greatly  reduce  the  processing  burden  that  true 
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natural  language  interaction  places  on  tiie  computer.  Tnis 
is  clearly  a  goal  that  can  test  be  accomplished  by 
cooperation  between  artificial  intelligence  and  human 
factors  specialists. 

2.2  Mhat  Human  Factors  Contributes 

Ford»  Meeks  and  Chapanis  (1390)  and  Michaelis  (1980) 
reported  a  series  of  experiments  that  were  conducted  in  the 
human  factors  laboratory  at  Johns  Hopkins.  In  these 
experiments^  two-person  teams  exchanged  information  over  a 
telecommunications  medium  in  order  to  solve  problems.  Half 
of  the  teams  were  rewarded  solely  for  correctly  solving 
their  problems.  The  other  half  had  their  correct  solution 
reward  diminished  for  each  word  token  they  used.  Thus, 
these  latter  teams  were  encouraged  to  keep  their 
communication  as  brief  and  concise  as  possible.  The 
problem-solving  task  assigned  to  tne  subjects  in  the 
Michaelis  experiment  is  typical  of  the  type  used  m  these 
studies:  One  team  member  was  given  a  completely  assembled 
prism-shaped  wooden  model  and  was  required  to  assist  the 
other  member,  who  had  to  build  an  identical  model  from  the 
separate  parts.  In  these  experiments,  the  team  members  were 
in  different  rooms.  In  the  Ford  ej^  j^.  study,  half  the 
teams  communicated  by  voice  and  the  other  half  via 
teletypewriters:  in  the  Michaelis  study,  all  communication 
was  over  teletypewriters. 


in  both  studies*  there  were  dramatic  and  highly 
significant  differences  betuteen  the  ttuo  experimental  groups. 
However*  it  is  important  to  note  that  problem-solving 
accuracy  was  not  affected  by  self-imposed  brevity. 


Show  Foil  Number  -2-  Here. 

(Summary  of  the  data  presented  in  the  next  paragraph.) 

Among  the  significant  differences  noted  in  both  studies 
are  that  the  self-limited  teams  generated*  on  the  average* 
about  one  fifth  as  many  word  tokens*  one  third  as  many  uiord 
types*  and  one  third  as  many  messages.  In  a  linguistic 
analysis  of  the  protocols  from  their  study*  Ford  c_t  a i . 
found  that  the  self-limited  subjects  used  proportional ly 
more  nouns  (41.9  vs.  26.  l':.  p  <  .001)*  feuer  pronouns  (5.5 
vs.  ll.9J(*  p  <  .001)*  fewer  verhs  (10.3  vs.  i6.9/C. 
p  <  .001)*  more  adjectives  (10.3  vs.  10.4^(*  p  <  .001)  and 
fewer  prepositions  (8.9  vs.  ll.3^.»  p  <  .035). 


Show  Foil  Number  -3-  Here. 

(Summary  of  data  presented  in  next  paragraph.) 

Probably  the  most  interesting  finding  of  these  studies 
is  that*  on  the  average*  the  self-limited  teams  solved  their 
problems  faster  than  their  unlimited  counterparts*  14.9 
versus  19.3  minutes  in  the  Ford  eJt.  aX*  study  and  20.5 
versus  27.6  minutes  in  the  Hichaelis  study.  This  difference 
was  not  statistically  significant  m  the  Ford  eX  aJL-  study. 
However*  in  the  Hichaelis  study*  which  tested  more  teams  f 
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Uhen  conpired  pith  the  unlimited  teams*  the  self-limited 
teams  generated: 

0  One  fifth  as  many  mord  tokens. 

0  One  third  as  many  word  types. 

0  One  third  as  many  messages. 


Mean  Percentenages  of  Parts  of  Speech  Used  by  Teams  in  the 
Tmo  Hord  Usage  Conditions,  (from  Ford*  et  al.*  1980) 


Parts  of  speech 

SelMifflited 

Unlimited 

P 

Nouns 

41.9 

26.1 

.001 

Pronouns 

5.5 

11.9 

.001 

Verbs 

10.3 

16.9 

.001 

Adjectives 

18.3 

10.4 

.001 

Prepositions 

8.9 

11.3 

.035 
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Average  Nuiber  of  Hinutes  for  leans  to  Solve  Their  Probleos 
in  Both  Experinents  and  Hord  Usage  Conditions. 


Experiient 

SelMilited 

Unliiited 

P 

Ford  et  al. 

14.9 

19.3 

N.S. 

Hichaelis 

20.5 

27.6 

<  0.005 
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ws.  32)*  tht  p  valut  was  less  than  0.085.  This  is  strong 
evidence  that  requiring  people  to  be  concise  does  not  hurt 
their  ability  to  communicate;  it  may  even  help. 

2.3  Uhat  Artificial  Intelligence  Contributes 

At  this  point*  natural  language  specialists  in  the 
Texas  Instruments  AI  group  became  involved.  They  contrasted 
the  limited  and  unlimited  protocols  from  the  Michael is 
study.  Their  goal  was  to  determine  how  the  dialog 
limitation  might  affect  the  processing  burden  of  natural 
language  computer  systems.  Tneu  were  specifically  concerned 
with  contrasting  the  effects  on  systems  that  do  a  syntactic 
analysis  first  and  then  pass  the  results  to  a  semantic 
component*  versus  those  which  integrate  the  semantic  and 
syntactic  components  during  analysis. 

Pronominal  reference  and  the  attachment  of 
prepositional  phrases*  two  stumbling  blocks  for  many  present 
syntactically  based  systems*  occur  somewhat  less  frequently 
in  the  limited  condition.  However*  in  the  limited  protocols 
over  one  third  of  the  utterances  were  ungrammatical*  while 
in  the  unlimited  case  this  was  closer  to  one  tenth.  They 
therefore  believe  that  syntax-first  approaches  will  have 
significantly  more  problems  parsing  the  limited  condition 
utterances  than  systems  which  have  less  reliance  on  syntax. 


Tht  word  types  used  m  the  limited  condition  «re 
virtuAtly  a  subset  of  those  used  by  the  unlimited  users; 
apparently#  many  of  the  words  used  by  the  unlimited  subjects 
were  not  necessary  for  the  solution  of  the  problem.  This 
finding  has  also  been  reported  in  a  study  of  -nteractiue 
1  ifflited-vocabu  1  ary  dialog  (Michaelis#  Chapams#  Weeks#  & 
Kelly#  1977)#  and  suggests  that  the  conceptual  coverage  of 
the  limited  protocols  is  less  than  that  of  the  unlimited. 
Therefore#  a  semantics  based  system#  such  as  a  semantic 
grammar  (c.f.  Burton#  1976)  or  conceptual  analyzer  (c.f. 
Sch£>.nk#  1975)#  could  possibly  gain  efficiency  from  the 
language  limitations. 

The  protocols  were  also  analyzed  to  examine  whether  the 
problem  solving  strategies  used  were  different  between  the 
unlimited  and  limited  conditions.  The  protocols  were 
classified  according  to  the  problem  solving  strategies  used 
and  the  ordering  of  their  subgoals.  No  statistically 
significant  differences  were  found  between  the  unlimited  and 
limited  conditions  in  the  number  of  teams  using  the 
different  strategies. 

In  3B  of  the  48  protocols  (nineteen  in  each  condition) 
the  subjects  used  subgoals  characteristic  of  classic 
means-ends  analyses  (Newell  %■  Simon.  1972).  These  teams 
established  two  major  subgoals  of  the  task#  building  the 
triangular  sides  and  building  the  rectangular  base.  The 
order  in  which  these  were  performed  did  not  significantly 
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difftr  bctuteen  the  limited  and  unlimited  condtiions. 

The  ten  remaining  teams  did  not  have  obvious  sub.  <als; 
six  used  an  approach  in  which  they  described  the  appearance 
of  the  model*  and  the  remaining  four  used  a  strategy  of 
making  small  pieces  and  then  connecting  these  together. 
Again*  no  significant  differences  were  found  between  the  two 
conditions  in  the  number  of  teams  using  each  strategy. 


Show  Foil  Number  -4-  Here. 

(Conclusions  from  NLP  research) 

To  summarize  the  findings  thus  far  in  this  research 
effort*  human  factors  specialists  found  no  evidence  that  the 
dialog  restriction  discussed  in  this  paper  will  hurt  the 
user's  efficiency.  Indeed*  the  llichaeiis  study  suggests 
that  the  efficiency  of  the  users  may  actually  be  improved  by 
well  chosen  limitations  on  the  interactions.  Further*  the 
language  restriction  could  not  be  shown  to  significantly 
change  the  problem  solving  strategies  used  by  the  subjects. 
The  protocol  analyses  performed  by  artificial  intelligence 
specialists  suggest  that  semanticslly  based  interactive 
natural  language  processing  systems  might  also  benefit  from 
this  restriction. 
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Conclusions 


Fron  a  huoan  factors  perspective: 

0  No  evidence  that  the  dialog  restriction  hurts  people's 
ability  to  coiiunicate. 

0  No  evidence  that  the  dialog  restriction  changes  people's 
problea  solving  strategies. 

Froa  an  A!  perspective: 

0  Soae  evidence  that  a  seaantically  based  interactive 
natural  language  processing  systea  aight  benefit  froa 
this  dialog  restriction. 


69 


3.0  INTELLIGENT  TUTORING  SYSTEMS 


A  second  illustration  of  the  AI/HF  synergism  involves 
the  development  of  "intelligent  tutoring  systems"  intended 
to  teach  elementary  computer  programing.  Such  systems 
represent  enhancements  over  conventional  "drill  and 
practice"  or  "frame-based"  multiple-choice  branching  systems 
because  they  incorporate  considerable  knowledge  about  the 
task*  the  student*  and  about  tutoring  per  se.  The  long-term 
goal  is  to  provide  a  computer-based  educational  experience 
comparable  to  a  one-on-one  interaction  with  an  expert  human 
tutor . 


3.1  Description  Of  The  Problem 

Three  systems  intended  to  teach  elementary  computer 
programming  are  examined.  The  first  system*  BIP  (for  "BASIC 
Instructional  Program")  CBarr*  Beard  and  Atkinson  19763 
serves  as  a  problem  solving  laboratory  wherein  students 


solve  programming 

exercises 

in 

the  BASIC  language. 

The 

second  system* 

SPADE-0 

(for 

"Structured  Planning 

and 

Debugging  Environment")  CMiller 

19793*  was  developed  as 

an 

AI  exercise.  Its  purpose  was  to  articulate  the  underlying 
programming  knowledge  involved  in  writing  graphics  programs 
in  the  LOGO  CPapert  19803  language.  A  third  system* 
currently  under  development*  blends  both  AI  and  human 
factors  considerations. 
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BIP  IS  an  ef f fi'.t  ive  working  computer  -based 
instructional  system  which  has  b(?en  m  use  by  the  Navy  for 
some  time,  both  as  a  research  vehicie  and  as  a  practical 
tool  for  teaching  BASIC  programming.  It  serves  as  a  useful 
representative  of  previous  work  on  programming  tutors.  (BIP 
has  been  in  use  in  our  laboratory  for  over  a  year.)  BIP  is 
able  to  individualize  the  sequence  of  tasks  encountered  by  a 
given  student,  on  the  basis  of  e  -.Kills  model  inferred  from 
performance  on  prior  tasks.  Hence.  not  every  student 
encounters  every  task;  and  sor.e  encounter  tasks  in  j 
different  order  than  others,  we  p'osent  a  (sligntly  ed’tcJ) 
excerpt  from  a  session  with  this  system  to  illustrate  its 
capabilities  and  limitations. 


COMPUTER:  On  the  first  ocy  :i  Xti.'S.  •someone's  iru.‘  lev; 

sent  him  a  pei  tricyc  fear  tree.  On  the 

second  day.  they  sent  t-uo  ti:rt!e  dovss  plus 
another  partridge  O  gifts.  2niJ  day).  This 
continued  through  the  twelfth  day.  when  ine 
true  love  sent  12  lords.  It  ladies.  ...  all 
the  way  to  yet  another  partridge.  Write  a 
program  that  computes  and  prints  the  number 
of  gifts  sent  on  that  twelfth  day.  (This  is 
not  the  same  as  the  TOTAL  number  of  gifts 
sent  for  all  12  days') 

STUDENT:  HINT? 


computer: 


This  program 
of  the  loop 
of  the  index 


requires  a  loop.  Each  execution 
involves  accumulating  the  value 
into  a  total . 


student:  hint? 


computer:  Finding  a  total  almost  always  means  you  need 
two  things: 

<1)  Setting  a  variable  to  0  before  the  loop. 
(2)  Accumulating  values  into  that  variable 
within  the  loop. 


BIP  h«s  impressible  features  for  creating  flooi  diagrciffls 
and  does  an  excellent  job  of  sequencing  tasks.  However#  its 
understanding  of  the  domain  is  limited  to  a  flat  collection 
of  language  constructs.  Based  on  informal  analyses#  BIP 
often  rejects  answers  that  students  believe  to  be  correct# 
it  tries  too  hard  to  elicit  a  single  solution#  which  is  not 
always  appropriate  in  complex  domains  such  as  programming. 

BIP  was  hampered  by  its  lack  of  understanding  or 
planning  and  debugging#  two  central  ai  concerns.  While  BIP 
could  individualize  the  sequence  of  tasks#  it  could  not 
individualize  the  hints  given  within  a  task.  Thus#  ail 
students  who  encountered  tr.t'  task  and  requested  two 
hints  would  see  the  same  two  hints  shown  above,  tq  imprc'.*-: 
upon  BIP's  pre-stored  hints#  our  problem  was  twcfcld:  to 
represent  the  underlying  knowledge  snd  to  apply  that 
knowledge  in  a  fashion  helpful  to  the  human  user. 

3.3  What  Human  Factors  Contributes 

The  goal  of  the  Ai  specialists  is  to  design 
“artificially  intelligent"  computer  environments  that  tutor 
students  in  much  the  same  way  that  a  human  teacher  might 
tutor  his  students.  The  AI  technology  has  progressed  to  the 
point  that  some  very  basic  questions  must  be  answered  before 
progress  can  continue:  What  makes  an  intelligent  human 
tutor  successful?  What  are  his  techniques  for  diagnosing 
student  problems  and  misconcept  ions'^  Whai  are  his 


techniques  for  ao^ising  students'^  In  short/  hoiu  does  he  use 
his  intelligence  to  provide  tutoring  superior  to  that 
provided  by  pre-stored  hint  systems  like  BIP'  All  of  these 
questions  relate  to  the  human-computer  interface/  so  the  AI 
specialists  at  Tl  took  the  questions  to  the  human  factors 
group . 

Job  and  task  analyses  are  two  of  the  basic  tools  of 
human  factors  engineering.  The  human  factors  group 
addressed  the  AI  specialists'  questions  by  setting  up  a 
system  in  which  a  computerized  intelligent  tutor  is 
simulated  by  having  an  intelligent  human  playing  the  role  of 
the  computer  tutor.  Very  simply,  the  human  tutor  observes  a 
student's  efforts  by  watching  a  monitor  that  is  slaved  to 
the  student's  work  terminal.  The  tutor  makes  judgments 
about  the  student's  problems  and  misconceptions/  and  =.ypes 
appropriate  help  messages  tha'  ’ppear  on  the  student's  nelp 
terminal.  It  is  important  to  recognize  that-  in  this 
paradigm/  the  human  tutor  bases  decisions  on  exactly  the 
same  information  that  would  be  available  to  the  computer 
tutor/  and  similarly  provides  help  the  same  way  that  the 
computer  tutor  should. 

In  these  studies,  the  human  tutor  is  carefully 
evaluated.  Human  factors  specialists  meticulously  record 
all  his  activities,  along  with  verbal  protocols  in  which  he 
explains  the  rationale  behind  his  decisions.  These  studies 
are  not  yet  complete,  but  a  clearer  model  of  the  intelligent 


hum<n  tutor  is  already  pii;erying.  One  .inportant  trend 
observed  thus  far  is  that  the  level  of  sophistication 
required  for  a  successful  tutor  might  not  need  to  oe 

as  great  as  was  originally  expected. 


Show  Foil  Number  -X~  Here. 

(The  following  paragraphs*  including  the  BASIC  code.) 

Here  is  an  example  of  a  problem  a  student  had  that  was 
easily  diagnosed  by  the  human  tutor.  The  student  was 
learning  how  to  program  m  BASIC*  using  the  BIP  problem  set. 
In  this  particular  problem*  the  student  was  asked  to  take 
two  numbers*  M  and  N*  and  compute  their  sum*  difference* 
product*  and  quotient.  This  is  what  the  student  typed: 

10  PRINT  -WHAT  IS  THE  FIRST  NUMBER” 

20  INPUT  M 

30  PRINT  "WHAT  IS  THE  SECOND  NUMBER” 

40  INPUT  N 
50  LET  A  =  M  4  N 

60  LET  B  =  M  -  M 

70  LET  C  i  M  *  N 

03  LET  D  r  M 

At  this  point*  the  student  pauses  fci  tver  a  minute - 
then  asked  for  help.  Quite  ciedrly.  the  student's  p'-ohlem 
was  that  he  did  not  know  the  symbol  for  division.  This  sort 
of  problem  is  representative  of  the  type  solved  Oy  the  human 
tutor  that  would  not  have  been  solved  by  a  pre-stored  hint 
tutor  like  BIP.  Note  that  even  a  very  simple  means-enas 
analysis  model  involving  sequential  accomplishment  cf 
subgoals  is  adequate  to  provide  a  correct  hint  here. 
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The  student  gas  asked  to  grite  a  BASIC  prograo  that  gould 
take  too  nunbersi  N  and  Ni  and  coopute  their  sufli 
difference)  product)  and  quotient.  Here  is  ghat  he  did: 


10 

PRINT 

"WHAT 

IS 

THE 

FIRST  NUMBER" 

20 

INPUT 

M 

30 

PRINT 

"WHAT 

IS 

THE 

SECOND  NUMBER 

40 

INPUT 

N 

50 

LET  A 

:  M  + 

N 

60 

LET  B 

:  11  - 

N 

70 

LET  C 

:  M  * 

N 

80 

LET  D 

:  H 

Uhen  he  got  to  this  point)  the  student  paused  for  over  a 
Ainute)  and  then  asked  for  help.  Uhat  information  does  he 
need  in  order  to  continue? 
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3.3  Whit  Artificial  Intelligence  Contributes 


The  crucial  cont r i but i ops 

of  AI  to 

CAI 

derive  from 

represent ing 

the  under  lying 

knowledge . 

In 

the  case  of 

programming# 

representing  the 

oomain 

knowledge  requires 

asking  such  questions  as#  "What  is  it  that  the  expert 
programmer  knows  that  the  nouice  does  not?"  Miller's  SPADE-0 
project  was  more  an  attempt  to  investigate  ana  formalize 
this  type  of  knowledge  than  to  build  a  useful  programming 
tutor.  It  represented  knowledge  about  programming  plans 
(i.e.»  procedural  templates  independent  of  the  particular 
programming  language)  and  debugging  techniques. 

3PADE-0  built  upon  A1  work  in  automatic  planning  and 
debugging  developed  in  HACKER  tSussman  1973],  tiVCROFT 
CGoidstein  1974],  and  NOAH  CSacerdoti  19753.  spade-3  coulo 
prompt  the  student  through  hierarchical  planning  processes, 
encouraging  the  student  to  postpone  premature  commitment  to 
the  detailed  form  of  the  code.  (This  AI  planning  technique 
grew  out  of  such  systems  as  ABSTRI'^S  Crefl.)  SPADE-0 
provided  a  vocabulary  of  concepts  for  describing  plans, 
bugs*  and  debugging  techniques#  and  handled  the  routine 
bookkeeping  tasks  involved  in  simple  program  development. 

Figure  XX  illustrates  a  sample  interaction  with 
SPADE-8.  The  key  feature  is  the  system's  deeper  analysis  of 
the  underlying  knowledge.  This  is  manifested  by  commands 
for  editing  the  plan  —  rather  than  merely  the  code  —  of 
the  student's  program.  However#  tne  design  of  SPADE-0 
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ignored  humen  factors  considerations*  imposing  its  own 
technical  uocabulary  on  the  student*  and  adopting  a  style  of 
interaction  that  took  away  much  of  the  initiative. 

Our  current  work  is  an  attempt  to  extend  the  underlying 
AI  knowledge  represented  by  SPADC-0  and  merge  it  with  the 
improved  human  factors  guidelines  resulting  from  careful 
analyses  of  what  good  human  tutors  do.  Like  BIP*  it  will 
dynamically  select  tasks  from  a  curriculum  database;  but 
like  SPADE>e*  it  will  build  a  model  of  the  student's  problem 
solving  skills  <rather  than  simply  recording  which 
programming  language  constructs  have  been  mastered).  The 
key  AI  aspect  is  fine-grained  diagnosis  of  student  errors  to 
provide  custom-generated  (rather  than  pre-stcred)  advice. 

Ue  are  basing  the  design  of  our  new  tutoring  module  on 
human  factors  studies  in  which  a  human  simulates  tnis 
module.  As  the  system  implementation  progresses*  additional 
tasks  will  be  taken  over  by  the  computer*  and  the  need  for 
the  human  tutor  to  intervene  will  be  correspondingly 
diminished.  The  proportion  of  tasks  successfully  performed 
by  the  computer  tutor  is  a  measure  of  our  progress. 

Earlier  “intelligent  tutoring  systems"  such  as  BIP  and 
SPAOE-0  used  their  intelligence  to  build  models  of  the 
student.  However*  the  interface  between  the  intelligent 
tutor  and  the  student  remained  crude.  By  working  with  human 
factors  engineers*  the  AI  specialists  now  better  understand 
how  human  tutors  interact  with  students.  The  emphasis  of 


79 


DFC  SYSTEM  20 


80 


COHPARISON  OF  HINTS  FROM  ‘DUHB*  TUTOR  AND  'SMART*  TUTOR 
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the  A1  work  h«s  now  shifted  to  modelling  this  tutor/'student 
interface. 

4.0  CONCLUSION 

In  closing*  it  is  utorthwh i  le  to  review  a  central  human 
factors  problem:  the  division  of  labor  between  human  and 
machine  in  human-machine  systems.  In  any  we  1 1 -des igned 
system*  tasks  are  allocated  to  those  components  best  suited 
to  perform  them.  Textbooks  on  human  factors  engineering 
typically  state  that  machines  tend  to  be  superior  to  humans 
in  such  tasks  as  calculation  and  coordination  of  many 
simultaneous  activities.  Conversely*  they  state  that  humans 
tend  to  excel  in  such  tasks  as  problem  solving  where 
originality  is  required*  pattern  recognition*  and  decision 
making  based  on  incomplete  or  conflicting  data*  or  when 
unlikely  or  unexpected  events  occur.  Thus*  these  guidelines 
would  allocate  responsibility  for  calculation  to  the 
machine*  but  leave  the  human  responsible  for  recognizing 
patterns  in  the  results  of  those  calculations. 

As  artificial  intelligence  continues  to  progress* 
machines  will  begin  to  achieve  superiority  over  humans  m 
many  aspects  of  tasks  traditionally  assigned  to  humans. 
This  might  lead  to  speculation  that  research  on 
human-machine  interfaces  may  be  unnecessary*  since  the  need 
for  the  human  component  wilt  disappear.  ror  certain  kinds 
of  menial  tasks  presently  performed  by  humans*  this  line  of 
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HUHANS  ARE  BETTER  AT: 


MACHINES  ARE  BETTER  AT: 


PATTERN  RECOGNITION 

APPLYING  ORIGINALITY  IN  SOLOING 
PROBLEMS 

MAKING  DECISIONS  BASED  ON 
INCOMPLETE  OR  CONFLICTING  DATA 

MAKING  DECISIONS  WHEN  UNLIKELY 
OR  UNEXPECTED  EUENTS  OCCUR 


ACCURATELY  AND  RAPIDLY  PERFORMING 
COMPLEX  CALCULATIONS 

COORDINATING  AND  PERFORMING  MANY 
SIMULTANEOUS  ACTIUITIES 

PERFORMING  ROUTINE  OR  REPETITIUE 
TASKS 

MONITORING 
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rtatoning  is  probably  sound.  Houieuer/  it  is  our  expectation 
that*  as  work  in  artificial  intelligence  and  human  factors 
engineering  continues  to  advance*  the  nature  and  power  of 
the  human-computer  interface  will  become  more  critical  and 
sophisticated.  The  art  and  science  of  interface  design  will 
never  become  obsolete.  Obsolescence  is  faced  only  by  our 
traditional  task-al locat ion  guidelines. 

This  paper  has  described  two  examples  of  research 
projects  in  which  AI  and  human  factors  specialists  have 
collaborated.  From  these  projects  and  others  like  them*  we 
have  learned  to  stop  thinking  in  terms  of  separate 
disciplines  that  merely  benefit  from  cooperation. 
Particularly  in  the  design  of  "intelligent  interactive 
systems*"  the  borderline  between  these  two  fields  has 
blurred  in  our  eyes.  Human  factors  specialists  are  learn.ng 
to  exploit  the  tremendous  benefits  for  the  human  component 
made  possible  by  more  intelligent  software  components;  PI 
specialists  are  learning  to  write  software  that  is  sensitive 
to  the  needs*  capacities*  and  limitations  of  the  human 
component.  Due  to  this  kind  of  synergism*  the  well-designed 
human-computer  interface  can  become  a  link  between  the 
creative  thoughts  of  men  and  machines*  contributing  to  a 
technological  revolution  that  offers  to  do  for  the  hut.  ..i 
mind  what  the  industrial  revolution  did  for  human  muscle. 
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OVERVIEW  OF  SELECTED  DISPLAY  FORMATTING 
AND  CLUTTER  REDUCTION  TECHNIQUES  ^  ♦  2 


Franklin  L.  Moses 
Human  Factors  Technical  Area 

US  Army  Research  Institute  for  the  Behavioral  and  Social  Sciences 

Alexandria,  VA 


System  and  software  designers  for  graphic  applications  have  a  real 
dilemma.  Designers  often  are  given  the  type  of  symbols  to  be  displayed, 
the  amount  of  Information  to  be  portrayed,  and  the  hardware  to  be  used.  If 
they  cannot  change  the  symbols,  reduce  the  data,  or  replace  the  hardware, 
what  can  be  done  to  make  a  display  speak  to  the  user  with  the  clarity 
desired?  One  solution  is  to  format  the  information  so  that  the  display  is 
compatible  with  the  user’s  perceptual  abilities  and  task  requirements. 

The  essence  of  such  formats  is  to  highlight  information  relevant  to  a  task 
and  thereby  make  it  stand  out  from  the  irrelevant  Information. 

The  goal  of  creating  **good**  displays  is  to  present  information  so  that 
user  needs  can  be  satisfied  quickly  and  efficiently.  However,  one  problem 
created  by  adding  more  information  to  a  display  screen,  even  if  it  is  rele¬ 
vant  to  the  user,  is  generally  called  clutter.  For  the  sake  of  discussion, 
clutter  exists  when  the  extraction  of  information  from  a  display  is  hindered 
by  the  density  or  similarity  of  symbols.  A  number  of  alternative  formatting 
techniques  can  be  suggested  to  reduce  clutter.  Of  course,  some  methods 
will  work  better  than  others,  depending  on  the  situation* 

Although  the  examples  of  formatting  in  this  paper  all  relate  to  Army 
applications,  the  principles  should  easily  generalize.  Army  representations 
of  the  battlefield  illustrate  a  classic  problem  for  displays:  or  users 
try  to  display  more  information,  they  end  up  extracting  less  due  to  clutter. 
Formatting  guidelines  are  needed  to  help  reduce  the  clutter  problem. 

Formatting  Situation  Displays 

Figure  1  is  a  typical,  albeit  ficticious.  Army  battlefield  map.  Anyone 
who  has  seen  a  real  one  will  recognize  this  one  as  a  severely  stripped  down 
version.  It  shows  only  the  most  essential  information:  terrain  (mountains, 
rivers,  roads  and  forests);  the  unit  type  (artillery,  infantry,  armor);  and 
the  unit  sizes  (division,  brigade,  and  battalion).  Yet,  it  already  is  clut¬ 
tered.  Consider  the  time  and  effort  that  a  person  would  need  to  compare  the 
number  of  armor,  artillery,  and  infantry  units,  even  on  such  a  simplified 
display.  Alternative  formats  using  the  same  symbols  and  the  same  information 
can  help  to  make  such  tasks  easier  for  the  user.  Several  suggestions,  based 
on  Army  Research  Institute  (ARI)  work,  should  allow  more  information  to  be 
meaningfully  displayed  without  adding  hardware  costs  or  decreasing  user 
performance. 


^An  earlier  paper  by  Leon  H.  Gellman  (currently  at  Sarah  Lawrence  College,  N.Y.) 
was  presented  at  the  US  Army  Second  Computer  Graphics  Workshop,  Virginia  Beach, 
VA,  September  1979,  and  used  as  a  basis  for  the  current  report. 

2 

The  views  expressed  by  the  author  do  not  necessarily  reflect  the  views  of  the 
US  Army  or  of  the  US  Department  of  Defense. 
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Redundant  Codes 


The  first  formatting  technique  to  be  discussed  is  based  on  the  re¬ 
search  of  Vicino,  Andrews  and  Ringel  (1965) •  They  doubly  or  redundantly 
coded  Information  on  a  battlefield  display,  thereby  allowing  users  two 
chances  to  find  the  information.  Redundant  coding  takes  information  which 
is  already  on  the  display  and  repeats  it  in  a  salient  code  that  helps  the 
user  to  organize  the  display.  For  example,  Figure  2  presents  the  map  with 
redundantly  coded  unit  symbols.  The  code  is  the  heavy  broken  line  for 
artillery,  the  heavy  rectangle  for  armor  and  the  heavy  X  for  infantry. 

There  is  no  more  or  less  information  here;  rather,  there  are  two  ways  of 
identifying  the  units.  The  double  code  has  been  used  to  maximize  the  saliency 
of  unit  types  making  similar  units  seem  to  stand  out  together.  When 
Vicino  et  al.  used  this  technique,  they  increased  the  speed  of  information 
extraction  by  97%  when  compared  with  a  s5.ngle  code.  Redundant  codes  will 
not  necessarily  increase  processing  speed  this  much  in  all  situations. 

However,  processing  should  be  easier  and  the  cost  of  such  formatting  is 
minimal.  Redundant  coding  can  be  done  with  any  number  of  stimulus  dimensions 
such  as  blinking,  size,  intensity  and  color. 

Sequential  Formats 

Sequential  Presentation  by  Topographic  Segments.  So  far,  the  discus¬ 
sion  has  centered  on  using  codes  to  organize  display  content.  If  a  display 
has  to  show  a  lot  of  detail,  then  a  second  type  of  format,  called  sequential 
presentation,  organizes  the  information  by  breaking  it  up  into  component 
parts.  This  is  accomplished  by  showing  information  in  segments  over  time. 
Sequential  presentation  reduces  clutter  by  showing  less  information  per 
screen  and,  for  similar  reasons,  it  increases  the  amount  of  detail  that  users 
can  see.  The  technique  is  particularly  useful  for  showing  standard  topo~ 
graphic  information  that  easily  exceeds  state-of-the-art  display  resolution 
capabilities* 

Sequential  formats  require  users  to  depend  on  their  ability  to  inte¬ 
grate  information  over  time.  Thus,  an  important  formatting  question  con¬ 
cerns  whether  to  display  segments  of  an  entire  map  by  scanning  them  or  by 
sequentially  presenting  static  (l.e.,  discrete)  views.  Based  on  an  ARI 
experiment  by  Moses  and  Maisano  (1979),  static  views  with  overlaps  of 
around  25%  are  more  efficient  for  users  than  continuous  scanning  methods 
of  sequential  map  presentations.  When  resolution  and  clutter  are  serious 
problems,  sequential  presentation  should  be  considered  as  a  solution. 

Sequential  Presentation  by  Data  Dimension.  The  final  formatting 
technique  to  be  discussed  is  also  a  sequential  presentation  method,  but 
this  one  displays  information  by  data  dimensions.  The  idea  is  once  again 
to  segment  information.  This  is  accomplished  by  presenting  a  limited  num¬ 
ber  of  data  dimensions  simultaneously  while  removing  other  information  from 
the  screen.  Of  course,  questions  such  as  how  many  separate  data  dimensions 
can  be  shown  per  screen  and  what  is  the  effect  of  user  control  over  selection 
of  dimensions  need  to  be  considered.  These  and  other  inquiries  about  sequen¬ 
tial  presentation  are  toj-ics  for  possible  future  Investigation  at  ARI. 


89 


Summary 


This  paper  discusses  the  problem  of  putting  too  much  information  on 
a  display  and  outlines  four  formatting  techniques  which  may  alleviate 
some  effects  of  clutter.  The  suggested  formatting  techniques  are  only  a 
few  of  many  methods  available  to  the  graphic  system  designer.  The  question 
that  remains  is:  Which  format  should  be  used?  The  answer  can  only  be  found 
by  determining  the  format  that  optimizes  task  performance  for  display  users. 
Clearly,  none  of  the  recommendations  made  here  will  provide  an  unconditional 
solution  to  graphic  problems.  However,  it  is  incumbent  upon  the  designer 
and  programmer  to  use  every  trick  at  their  disposal  to  provide  graphics 
which  have  the  impact  and  clarity  commonly  believed  possible.  The  Workshop 
presentation  will  consider  this  goal  in  more  detail  along  with  some  guide¬ 
lines  for  attaining  it. 
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FORMAL  GRAMMAR  REPRESENTATION  OF  MAN-MACHINE  INTERACTION 


Phyllis  Reisner 
IBM  Research 
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San  Jose,  CA  95193 

End  users  communicate  with  a  computer  system  by  using  a  language.  The 
language  might  be,  for  example,  a  query  language,  a  natural  language,  or 
an  "action  language"  -  a  sequence  of  button  presses,  typing  actions, 
cursor  or  lightpen  actions,  etc.  These  user  input  languages  can  be 
represesented  in  the  same  way  as  any  other  language  •  by  a  formal  grammar 
which  shows  the  permitted  strings  and  also  shows  the  structure  of  the 
language. 

The  work  to  be  described  in  this  talk  attempts  to  use  a  formal  description 
of  the  user  input  language  as  a  design  tool  to  improve  the  ease-of-use  of 
a  man-machine  interface.  The  talk  will  first  describe  earlier  work,  which 
uses  a  BNF-like  grammar  in  the  context  of  a  color-graphics  system  for 
making  slides.  It  will  then  discuss  current  work  using  a  formal  grammar 
to  describe  text  editing.  The  current  work  is  first  attempting  to  make 
s  ime  of  the  concepts  introduced  informally  in  the  earlier  work 
sufficiently  precise  that  people  with  a  variety  of  backgroxinds  can  use 
them. 

The  field  of  human  factors,  which  attempts  to  measure  and  improve  the 
ease-of-use  of  products,  is  largely  experimental.  It  uses  techniques  of 
behavioral  science  as  its  prdLmary  methodology.  The  intent  of  the  work 
with  the  color-graphics  system  was  to  demonstrate  that  a  formalism  could 
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be  applied  in  this  area  which  is  usually  considered  soft,  or  even  ad  hoc. 
The  intent  was  also  to  explore  the  possibility  of  using  the  formalism  to 
compare  alternative  designs  for  ease-of-use  and  to  located  design  flaws 
that  might  cause  user  problems.  We  wanted  to  see  whether  a  tool  could  be 
developed  that  had  some  predictive  potential.  One  problem  with  the  usual 
behavioral  approach  to  interface  design  is  that  it  must  frequently  await 
the  existence  of  a  prototype  or  working  model.  We  wanted  to  augment  this 
approach  with  a  more  analytic  one. 

The  color-graphics  system,  ROBART,  existed  in  two  versions,  ROBART  1, 
which  was  designed  without  explicit  attention  being  paid  to  ease-of-use, 
and  ROBART  2,  a  redesigned  version  with  the  end-user  a  major  focus  of 
attention.  It  was  an  experimental,  interactive  system  for  creating  slides 
for  technical  presentations.  It  was  intended  to  be  used  by  people  without 
computer  training  doing  non-routine  tasks.  The  function  available  in  both 
versions  was  essentially  the  same,  br  »*he  design  of  the  human  interface 
differed. 

To  explore  the  issues  discussed,  the  ”  action  language”  of  the  first 
version  was  described,  using  a  BNF-like  notation.  (In  this  action 
language,  the  user  selected  colors  by  dipping  a  cursor  into  a  paintbox  of 
colors  on  a  CRT  screen  by  using  a  joystick,  selected  shapes  such  as  lines, 
circles,  rectangles,  etc.  by  verious  combinations  of  switch  selections 
and  button  presses  on  an  external  switchbox,  indicated  the  location  and 
orientation  of  the  shapes  by  combinations  of  cursor  positioning  and  button 
presses.  It  was  also  possible  to  type  textual  material  on  the  screen,  in 
color).  Portions  of  the  action  language  for  ROBART  2  were  also  described, 
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also  using  the  BNF-like  notation* 


The  next  step  was  to  make  predictions,  from  these  formal  descriptions, 
about  very  specific  differences  in  the  ease-of-use  of  the  two  versions, 
and  then  to  test  the  predictions  to  see  if  they  were  in  fact 
substantiated.  The  goal  was  to  see  if  formal  grammar  could  be  used  as  a 
predictive  tool  and  if  the  predicted  differences  were  indeed  measurable. 

This  did  indeed  turn  out  to  be  the  case.  Among  others,  we  predicted  that 
the  action  of  selecting  shapes  would  be  more  difficult  in  ROBART  1  than  in 
ROBART  2,  for  each  of  the  shapes  available.  We  also  predicted  that  users 
would  make  a  particular  error  in  "initiating”  shapes  (the  first  action  to 
indicate  location  and  orientation)  in  ROBART  1  and  would  not  make  an  error 
in  the  same  step  for  ROBART  2.  Since  the  same  error  was  not  expected  to 
occur  in  ROBART  2,  we  felt  that  the  problem  would  indeed  be  attributable 
to  the  interface  design  and  was  not  inherent  in  the  function  itself. 

In  an  exploratory  experiment  with  temporary  office  workers,  the 
predictions  were  in  fact  substantiated. 

Current  work,  in  the  context  of  text  editing,  is  first  attempting  to 
clarify  some  of  the  concepts  and  techniques  used  in  the  above  work.  The 
concepts  were  intuitive,  but  not  precise  enough  to  develop  into  a  design 
tool  to  be  used  by  a  variety  of  people  with  different  backgrounds.  For 
example,  we  introduced  the  notion  of  a  "cognitive"  terminal  symbol,  since 
we  thought  that  what  the  naive  user  has  to  learn  and  remember  will  be  of 
major  importance  in  the  ease-of-use  of  a  system  he  uses  intermittently. 
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This  notion  clearly  needs  to  be  made  more  precise.  We  also  used  a 
quasi-automatable  technique  for  locating  structural  inconsistencies  in 
the  language.  We  expected  these  structural  inconstencies  to  cause  users  to 
make  mistakes.  Neither  the  notion  of  "structural  inconsistency"  nor  the 
technique  have  been  made  explicit.  These  and  other  related  issues  will  be 
discussed. 
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A  RULE  BASED  HELP  SYSTEM  FOR  SCRIBE 


ELAINE  RICH 
AARON  TENIN 

26  February  1961 
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People  need  access  to  help  It  they  are  golna  to  use  complex 
computer  systems  effectively*  There  will  not  always  oe  other 
people  or  even  manuals  arouno  to  help  them.  So  we  need  tne 
computer  Itself  to  be  able  to  provide  the  nelP  Its  users  need* 
This  is  not  a  new  arqument*  See*  for  example  *  tPirtle  681, 

The  extent  to  which  anyone  can  help  someone  else  Is  limited  by 
the  depth  of  the  helper's  own  Knowledge,  So  if  computers  are 
going  to  help  people*  they  must  nave  a  great  deal  of  Knowledge 
about  wnat  tney  do, 

but  tne  usefulness  of  nelo  information  to  a  person  seeKing  help 
is  a  direct  function  of  tne  extent  to  which  the  Information 
answers  the  specific  question  the  user  nad.  So  simply  dumping  an 
entire  manual  or  even  large  chunKS  of  It  on  a  user  every  time  he 
asKs  a  Question  Is  useless. 

People  Who  need  help  are  missing  some  information  about  how  the 
system  *orks*  So  they  cannot  be  counted  on  to  describe  tnelr 
problem  in  terms  of  specific  system  commands  so  that  the  relevant 
parts  of  tne  manual  can  be  found  and  fed  back  to  tnem,  (This 
precludes  simple  keyword  based  help  systems  such  as  [Shapiro  1b] 
or  iKenler  601,) 

These  obvious  tacts  force  us  to  the  conclusion  that  to  provide 
a  good  interactive  help  facility  will  require  a  laroe  data  base 
of  knowledge  about  the  operation  of  the  system  in  question.  This 
data  base  must  he  structured  In  suen  a  *ay  that  it  can  be 
accessed  from  descriptions  at  a  variety  of  levels  about  what  the 
program  did  and  what  the  user  wanted.  To  investigate  the  issues 
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raised  by  sucr>  constraints,  »e  are  building  a  help  system  for  the 
docu'iient  formatting  prograo^  Scribe  IKeid  80), 

me  knowledge  base  used  ny  tne  system  is  a  set  of  rules  that 
describe  Serine's  oenavlor  at  a  variety  of  levels.  Too  level 
rules  describe  the  behavior  of  the  system  in  terms  of  fairly  high 
level  functions,  other  rules  then  describe  tnose  functions  in 
terns  of  lower  level  functions,  and  so  tortn,  we  plan  Initially 
not  to  try  to  provide  rules  that  describe  scribe  down  to  the 
lowest  level,  at  which  individual  characters  are  placed  on  tne 
page.  This  wHi  of  course  limit  the  ability  of  the  system  to 
answer  auestlons  about  that  aspect  of  the  system's  performance, 
But  tots  is  analogous  to  tne  situation  that  occurs  with  human 
consultants.  There  comes  a  point  wnere,  unless  they  are  familiar 
wltn  toe  details  of  the  coae  of  the  system,  they  simply  cannot 
answer  a  question,  this  rule  cased,  successive  decomposition 
approach,  however,  prevents  us  from  being  locked  into  a 
particular  level  of  aescription,  Aiew  rules  that  provide 
additional  levels  of  description  can  oe  added  at  any  time, 

each  rule  in  tne  system  contains  a  left  side  that  describes 
when  it  can  be  Invoked,  and  a  right  side  that  describes  the 
sequence  of  actions  that  will  result,  the  left  side  consists  of 
two  carts,  a  command  or  a  piece  of  the  input  file,  which  tries  to 
trigger  tne  rule,  and  a  list  of  auxiliary  conditions  that  must  be 
met  In  order  for  the  rule  to  be  aole  to  oe  Invoked,  For  example, 
the  foilowlno  rules  describe  how  Scribe  orocesses  tne  tarefCarg) 
co-n.iand,  which  substitutes  tor  the  string  "tarefCarg) " ,  tne 
reference  indicated  by  tne  string  arg,  (Commands  to  Scribe  are 


98 


signalled  by  the  character 


1  ^ref(arg)  and  loolcupsynboltao’ eCarg)  NKU  i)  -> 

send  (main text »  looxupsyiTiholtable(arg} ) 

2  @ref(arg)  and  loo<upauxtile(arg)  NtSQ  0  «•> 

send  ('TiaintextflooKupaux  tile  (arg) ) 

3  #re£(arq)-> 

send(iiialntext»«c(arg) ) 

send (err or file, "undefined  reference" »arg) 


The  order  of  tne  rules  in  the  data  base  reflects  the  order  in 
which  Scribe  cnecics  for  things,  in  this  example.  Rule  i  says 
that  If  tnere  is  a  ref  command  with  a  particular  argument  and  if 
there  is  an  entry  in  the  internal  symool  table  indicating  a 
previous  definition  of  that  argument,  tnen  print  in  tne  outout 
the  approorlate  value  as  indicatea  by  the  definition,  otherwise, 
if  there  is  a  definition  of  the  argument  in  the  AUX  file  (a  file 
containing  the  sympol  table  that  was  built  tne  last  time  Scribe 
processed  this  file)  then  use  that  definition.  It  there  was  no 
definition  in  either  place,  then  simply  Insert  into  the  text  the 
string  that  was  the  argument  to  ref.  but  capitalize  it.  Also 
maKe  a  note  of  this  error  in  tne  error  log  file. 


The  actions  indicated  by  these  rules  are  fairly  high-level. 
They  inalcate  that  text  should  be  placed  in  output  files.  They 
do  not  indicate  how.  They  do  not  specify  such  things  as  the 
margins  or  the  type  font  to  be  used.  Those  things  are  specified 
in  the  rules  tnat  descrioe  the  operation  of  the  send  function. 
Some  of  the  actions,  sucn  as  send,  can  only  oe  generated  oy  the 
operation  of  otner  rules,  others,  sucn  as  @c(text).  could  also 
have  occurred  in  tne  input  file.  The  fact  that  tne  Scribe  system 
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Is  very  well  structured  makes  It  easy  to  describe  tne  operation 
ot  one  function  in  ter^n*  ot  a  well  defined  set  of  otner 
functions.  This  one-step-at-a*tlme  description  is  very  important 
for  tne  gener'^tion  of  responses  to  user's  questions,  no  one 
wants  a  bit  level  answer  to  every  question  they  ask.  People 
usually  want  a  description  in  terms  one  or  perhaps  two  levels 
hlaher  or  lower  than  tne  level  at  which  they  asked  the  question. 

The  set  of  rules  provides  a  static  description  ot  the  way 
operations  In  Scribe  arc  oerformed  in  terms  ot  other,  lower  level 
operations.  As  Scribe  executes,  it  builds  a  separate 

hierarchical  structure  tnat  reflects  tne  block  structure  of  tne 
specific  document  that  is  oelng  orocessed.  For  example,  a 
document  could  contain  tne  sequence: 

abeqlnC quotation) 

t  •  • 

0beq in (Itemize) 

9  9  9 

i3end(  itemize) 

•  •  • 

(iend(ouotatlon) 

The  quotation  environment  specifies  that  the  margin  should  be 
moved  in  and  that  the  text  should  be  printed  slnqle  spacea.  The 
lter.'.ize  environment  soeclfles  tnat  the  margins  snould  oe  moved  in 
and  that  paraqraons  snould  oe  numoered,  Tnese  specifications 
nest,  so  tnat  the  margins  inside  tne  itemize  will  oe  narrower 
than  for  the  rest  of  the  quotation,  wnicn  will  be  narrower  than 
the  surrounding  text, 

JO  answer  a  user's  questions,  tne  help  system  will  match  nieces 
of  tne  user's  question  against  pieces  of  rules,  and  use  unmatcned 
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pieces  of  tne  r>ues  or  Patterns  of  chaining  tnrough  the  rules  as 
answers  to  the  ciucscaons,  '-any  questions  can  oe  ans<*erea  oy 
referrinu  only  to  toe  static  oescription  of  Scrloe's  operation, 
Ho«'ever,  when  a  question  refers  to  soTething  specific  tnat 
hepceneu  at  a  particular  point  in  tne  user's  file,  it  maY  ce 
necessary  for  the  help  systeir  to  polio  a  Piece  of  the  dynamic 
tree,  mirroring  that  ouilt  uy  Scrloe  durlnq  execution,  so  that  it 
wltl  <no*  enough  context  to  t-c  aoXe  to  identify  the  rules  that 
were  apr-llcd. 

One  of  tne  most  cof.mon  tyoes  of  questions  a  help  system  must 
ans-^er  is  "Koy  din  x  occurf”,  fnls  usually  means  tnat  the  user 
expected  that  somethino  else  would  occur.  To  answer  such 
questions,  the  neJn  system  finds  tne  rules  whose  right  hand  sloes 
Specify  the  etlcct  the  user  has  descrloed.  Let's  assume,  tor 
slJU'llclty,  tnat  t>iere  is  exactly  one  such  rule,  no*  a 
suoertlclal  answer  to  tlie  question  is  simply  to  state  the  left 
side  of  tnat  rule.  .lut  much  of  what  is  there  is  usually 
redundant.  For  example,  the  user  knows  wnat  command  he 
specified,  what  tne  help  system  will  do  is  to  compare  the  rule 
it  toun<i  to  others  whose  left  sides  are  different.  The 
differences  in  the  left  sides  are  the  specific  reasons  why  the 
observed  effect  occurred,  ratner  than  some  other,  so,  for 
example.  If  the  user  asks  why  nls  !?ref  commano  resulted  in  the 
laoel  and  not  tne  thing. to  wolch  it  referred  being  printed,  the 
system  oDserves  tnat  tnls  happenea  oecause  the  label  was  not 
previously  defineo.  It  concluoed  this  oy  comparing  Rule  3,  tne 
one  tnat  •describes  wriat  Scrioe  c!i(4,  to  Roles  i  and  2,  wnlcn 


101 


describf*  A»Mdt  tt  '•jouAH  'i<ive  done  if  tnln?s  n<ja  been  sliobtly 
dl f  terent » 

So.'ietlnes  there  i-iy  oe  a  oreat  niany  rules  i»nose  lett  sloes 
almost  fi'accn  the  selected  rule.  It  may  tnen  be  necessary  Cor  the 
helr'er  to  asK  tnc  user  what  he  expecteo  to  have  happen.  Then 
only  tne  rules  whose  rtent  sides  n-aten  that  expected  action  need 
to  be  considered,  /deally  tne  syste-i  would  loalntaln  a  good  nodei 
of  tne  user  so  tnat  such  nuestions  would  rarely  need  to  be  asxed, 
Sor.>etlmes  general  Kno*leaoe  aoout  tne  way  peooie  use  tne  systei?) 
xlli  nelc-  nere,  for  exar-pie,  people  usually  expect  some  fairly 
direct  connection  between  tne  coi'-nands  they  Issue  and  tne  results 
they  see,  Jney  rarely  expect  a  command  to  oe  a  no-op.  But  there 
will  always  dc  limes  woen  an  Individual  has  an  Idiosyncratic 
•nlsunderstanding  or  tre  system  and  nothing  short  of  a  direct 
luestlon  will  point  this  out.  For  tnis  reason,  tne  process  of 
answering  a  guestjor.  -’ust  oe  thouont  of  as  a  olaiogue  rather  than 
as  a  one-shot  ouestion  and  answer, 

Arotner  comiron  type  of  question  Is  what  (ienesereth  IGeneseretn 
/HI  calls  the  "»*o*.JO'‘  question.  For  exanpie,  "How  do  I  get  my 
toocootes  to  co'iie  out  at  the  enn  of  my  docuirent  rather  than  at 
the  end  of  each  pav^e?",  riowdo  questions  are  answered  oy  matching 
the  user's  descrlntlori  of  wnat  ne  wants  to  do  against  the  rignt 
si jes  of  tne  rules  to  find  tnose  that  can  produce  the  desired 
effect,  if  there  ate  tore  tnan  one,  tnen  the  cnoicc  among  them 
wli.1  »e  made  oy  conslderinu  suen  things  as  the  complexity  of  tne 
constructs  Involved  ana  tne  user's  level  of  expertise  witn  the 
sys*^e-.i,  Tne  left  slue  ot  tne  chosen  rule  describes  what  Is 
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necessary  to  accoir.ciish  tne  deslrea  effect.  Hut  It  may  contain 
conditions  that  tne  user  cannot  soeclfy  directly.  So  the  help 
systenr  nijst  chain  oacKivarcls  tnrouoh  the  rules  to  find  the 
commands  that  will  Cause  toose  conditions  to  be  true. 

Vet  another  copimon  type  of  inouiry  is  the  ’’what  is  the 
difference  hetveen"  question.  For  example,  a  Scrloe  user  might 
ask,  "wnat  is  the  difference  between  the  Itemize  and  enumerate 
commands?",  ihese  questions  Can  oe  answerea  easily  ny  tnis  kind 
of  rule  based  system  wltnout  navlno  been  anticipated  in  advance. 
It  need  merely  find  tne  rules  that  descrloe  tne  operation  of  eacn 
commend  by  natching  against  left  sides,  in  toe  simple  case, 
there  will  re  one  rule  for  each  and  the  answer  to  tne  question  Is 
simply  a  list  of  the  differences  netween  the  corresconalng  rlqnt 
hand,  sines.  In  more  complex  cases,  It  win  oe  necessary  to 
compare  left  nand  sides  also  to  determine  tne  effect  of  various 
otner  factors  on  tne  operations  of  the  two  co"‘mands, 

One  of  tne  -Tiost  com.non  situations  in  whlcn  users  ask  questions 
Is  wnen  they  nave  qotten  some  wina  of  error  message  from  the 
system.  lalklng  about  such  errors  Is  easy  tor  a  rule  oaseo 
system,  ine  rules  describe  all  the  things  tne  system  can  do  and 
the  situations  in  wnlcn  It  win  do  them.  Errors  ao  not  need  to 
be  represented  exollcltly.  They  are  Implied  oy  tne  absence  of 
rules.  If  the  user  wrote  a  conunand  X  and  there  are  no  rules  tor 
command  X  whose  otner  preconditions  were  satisfied  at  the  time 
the  command  occurred  tnen  an  error  will  arise,  rne  system  can 
explain  the  errer  py  comparlnq  tne  existing  state  to  the  reaulred 
orecon iltl or.s  and  reporting  the  differences,  Ihls  Is  extremely 


103 


since  tor  a  complex  svste'i.  the  nairoer  ot  possible  error 
conf Iquratlons  can  oe  very  large  and  it  would  oe  very  dltficult 
to  t  dve  to  nescT  l.ne  each  ot  tnem  explicitly, 

A  0000  help  system  must  tailor  Its  responses  to  the  needs  of 

i 

InolviiHial  users,  iii  this  it  Is  no  different  from  other 

I 

Interactive  svstei's  l^<ich  79J,  une  '•ay  to  represent  a  model  of  ! 

a  Srrine  user  '•(oulrt  be  as  a  set  of  rules,  presumanly  a  suoset,  i 

oosslblv  wltn  errors,  of  the  rules  that  tne  system  knows,  with  ’ 

such  a  model,  some  question  *ouia  oe  very  easy  to  answer.  For 
exan.nie,  why  questions  couin  ne  answered  oy  comoarlng  tne  user's 
rules  anainst  ti'e  syste'^.'s  correct  rules  to  find  the  difference 
and  retort  it.  Inis  tec^'-nique  was  suggested  oy  hurton  and  brown 
thurton  7u]  as  a  way  an  intelllqent  CAi  system  coula  discover 
Duqs  in  a  student's  knowledge,  uut  it  is  unreasonaole  tor  a  help 
systof..  to  -.tiaintain  sucn  a  massive  amount  ot  Inforniation  apout 
each  user,  Insteao,  we  prooose  to  record  a  very  small  numoer  of 
facts  ai-out  each  user,  sucn  as  a  measure  of  his  expertise  with 
tne  syste-f,  fe.ach  ot  tne  objects  used  In  tne  system  will  have 
asscciateo  with  It  soifie  properties,  some  of  which  can  ce  matched 
aqalnst  user  characteristics  to  determine  tne  appropriate  rules 
to  use  In  Qen._'ratlnu  a  response  to  tne  question.  So,  for 
example,  commands  win  oe  irarxea  as  simple,  Intermediate,  or 
advanced,  utnor  factors  tnat  should  be  Included  In  the  model  of 
each  user  are  nls  Inclination  toward  oeinq  a  hacxer  ti,e,  ooes  he 
want  to  lean  fancy  new  commands  or  does  ne  want  to  know  a  v/ay  to 
get  ov  with  tne  co  ^i-nands  ne  knows/)  and  nls  familiarity  wltn 
cortputer  science  concents  (such  as  olocK  structure,  one  pass 
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system,  syi'ool  facies). 


One  of  f.ne  I'^^aior  anvantages  of  this  rule  based  representation 
of  tne  Kno-^ledqe  reiulred  oy  an  inteillaent  helper  is  that  it 
mirrors  the  structure  of  the  system  for  ><0100  the  help  is  belnq 
provided,  (Ur  at  least  It  does  if  the  system  is  i»eli 
structured,)  vnis  sucioests  that  toe  top  do^tn  process  of  vurltinq 
the  rules  could  oe  used  to  oroouce  a  well  structured  program  ana 
Its  f.ejo  svstei-  simultaneously,  ne  would  ll*ce  eventually  to  try 
to  build  an  entire  system  tnls  way. 
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Models  for  the  Design 
of  Static,  Software  Systems 


M.L.  Schneider 
Sperry  Univac 
Blue  Bell,  Pa  19424 


1.  INTRODUCTION 

One  of  the  "axioms”  for  ease-of-use  is:  "Help  systems  are  necessary" 
(Clark  1980) .  While  an  increasing  number  of  of  software  systems 
provide  some  form  of  user  assistance  (Relies  1979) ,  the  information  is 
usually  provided  without  regard  to  its  useage.  In  general,  assistance 
is  nothing  more  than  an  "electric  reference  manual." 

When  factoring  exists,  it  usually  consists  of  a  layered  approach;  the 
user  can  request  additional  details  about  a  specific  topic.  This 
addresses  the  problem  of  verbosity,  but  only  indirectly  considers  the 
expertise  level  of  the  requester. 

This  paper  proposes  cognitive  factors  that  may  impact  information 
factoring:  different  levels  of  user  sophistication  (the  User  Taxonomy) 
and  different  segments  of  task  performance  (the  Transaction  Taxonomy). 
The  interaction  between  these  two  taxonomies  can  provide  guidelines 
for  improved  static  information  factoring  in  assistance  systems. 

2.  USER  SOPHISTICATION  TAXONOMY 

The  developmental  levels  of  computer  language  acquisition  defined  in 
this  taxonomy  are 

1.  Parrot 

2.  Novice 

3.  Intermediate 

4.  Advanced 

5.  Expert 

Each  level  is  characterized  by  skills  in  language  production:  item, 
field,  or  statement  chunking;  breadth  of  language  scope;  and  degree  of 
generalization  or  abstraction  of  concepts.  The  change  in  system 
knowledge  is  manifested  through  an  increased  competence  in  the 
commands  that  are  regularly  used  and  an  awareness  of  additional 
functions  available  within  the  system  or  language. 
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The  basis  for  this  taxonomy  arises  from  qualitative  observatior.s  of 
computer  usage  in  a  wide  variety  of  software  systems  and  the 
relationship  between  the  observed  computer  productions  to  those 
observed  in  the  natural  language  development.  This  taxonomy  describes 
an  individual's  expertise  or  sophistication  in  a  single  software 
system  or  language  (or  subset  thereof)  and  may  not  be  transferable. 

The  level  at  which  an  individual  stops  progressing  appears  to  depend 
upon  a  number  of  factors  related  to  the  learning  of  complex  tasks  and 
the  demands  placed  upon  the  person  by  the  task  requirements. 


2.1.  THE  PARROT 

An  individual  at  the  lowest  level  in  the  taxonomy,  the  Parrot,  has 
minimal  knowledge  of  the  computer  system.  The  Parrot  approaches  the 
computer  system  and  types  commands.  This  individual  does  not  think, 
question,  understand,  or  synthesize  the  commands.  These  commands,  or 
sequence  of  commands  in  some  cases,  may  be  moderar.ely  complex. 
Satisfaction  is  derived  simply  by  having  the  computer  perform  the 
task. 

When  the  question  "What  am  T  doing?"  is  asked,  the  Parrot  is  ready  to 
progress  to  the  next  stage  of  sophistication:  the  Novice. 


2.2.  THE  NOVICE 

With  experience,  a  user  begins  to  understand  several  isolated  concepts 
and  is  able  to  choose  a  specific  lexical  entry  (command)  for  a 
function.  The  user  is  required  to  know  specific  but  not  complex 
information.  Semantically,  the  items  are  considered  in  the  concrete, 
not  in  the  abstract.  The  Novice  may  ask,  "What  does  this  command  item 
do?"  not  "What  can  it  do?"  By  now,  the  user  has  a  minimal  command  of 
the  grammar,  but  is  only  able  to  operate  on  an  item-by-item  basis. 

For  example,  the  Novice  may  recognize  a  verb  and  one  or  more  objects 
in  a  command,  even  if  the  grammar  allows  modifiers  in  the  verb  phrase 
or  in  the  object  phrase. 

Unlike  the  Parrot,  the  Novice  analyzes  each  item,  thus  extracting 
lexical  information.  The  language  components  now  have  meaning  and  can 
be  used  in  a  flexible  manner. 


2.3.  THE  INTERMEDIATE 

The  Intermediate  is  a  level  between  the  Novice  and  the  Advanced  user. 
Whereas  the  Novice  concentrates  on  items  in  isolation,  the 
Intermediate  operates  with  items  in  fields  and  with  fields  in 
statements.  A  statement  now  becomes  the  primitive  conceptual  unit. 
The  use  of  a  larger  chunk  encourages  syntactic  and  semantic 
conciseness  in  the  grammar,  allowing  the  user  to  minimize  keystrokes. 
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At  times,  the  Intermediate  user  may  link  statements  into  command 
"chains"  such  as  compile. . .collect .. .execute.  Even  so,  each  command 
is  still  considered  it.  isolation.  The  user  generally  waits  until  a 
function  has  been  completed  before  proceeding  to  the  next  request, 
wishing  to  see  the  result  of  a  command  before  continuing  with  the 
task. 

The  Intermediate  begins  to  concentrate  on  the  task  rather  than  its 
components.  Use  of  the  full  language  may  be  restricted  by  a  lack  of 
knowledge.  Thus,  the  Intermediate  continues  to  expend  significant 
effort  on  language  details.  At  this  point  in  the  user's  development, 
the  more  subtle  grammatical  rules  become  evident.  A  Novice  would  use 
a  default,  unaware  of  the  fact  that  an  item  can  be  specified.  An 
Intermediate  would  consciously  use  a  default  in  order  to  reduce 
keystrokes  or  save  time.  Initially  the  Intermediate  uses  knowledge  in 
a  specific  problem  domain.  Later,  this  information  is  generalized, 
allowing  new  problems  to  be  solved. 

Toward  the  end  of  the  Intermediate  level,  considerable  skill  in  the 
understanding  and  manipulation  of  a  segment  of  the  command  set  has 
bean  achieved.  With  the  increased  use  of  larger  syntactic  chunks  each 
requires  less  attention.  This  is  the  procas:  of  automatization. 

Thus,  increased  attention  can  be  given  to  the  entire  task,  rather  than 
to  the  mechanisms  required  for  its  performance. 

With  further  experience  and  increased'  task  requirements,  the 
Intermediate  can  evolve  into  an  Advan -ed  user,  subordinating  the 
computer  language  to  the  task. 


2.4.  THE  ADVANCED  USER 

Whereas  the  Intermediate  attempts  to  solve  problems  via  a  series  of 
isolated  commands,  the  Advanced  user  realizes  that  an  interconnected 
col-lection  of  statements  can  be  more  productive  for  certain  tasks.  At 
this  level  a  program  or  procedure,  rather  than  a  single  statement, 
results.  Because  commands  are  now  interrelated,  the  scope  of  the 
syntax  and  semantics  expands.  The  syntactic  elements  are  abstract 
rather  than  concrete.  Data  structures  provide  the  vehicle  for 
producing  abstract  objects.  For  example,  a  variable  would  be  used  to 
represent  a  filename  or  a  string.  The  Advanced  user  continues  to 
retain  the  command,  together  with  other  defined  procedures,  as 
language  primitives. 

Control  structures  are  useful  if  the  direction  of  flow  between 
statements  is  to  be  modified.  Using  these  structures  requires  a 
modification  of  the  user's  thought  process.  A  Novice  or  Intermediate 
user  may  not  foresee  the  success  or  failure  status  of  a  command  as  an 
object  on  which  operations  are  defined.  An  Advanced  user  thinks  about 
the  possible  outcomes  of  commands  and  has  the  ability  to  take 
appropriate  action.  While  'Novice  and  Intermediate  users  operate  with 
concrete  syntactic  constructions,  existing  with  in  a  specific, 
restricted  semantic  scope,  the  Advanced  user  expands  his  language 
knowledge  to  cope  with  complex  structures  and  abstractions. 
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Practically  speaking,  the  Advanced  user  has  the  ability  (though  not 
necessarily  the  need)  to  accomplish  any  function  within  the  system. 
The  Advanced  user  is  completely  facile  with  the  language  and  can  deal 
with  the  language  at  the  global  "metalinguistic"  level. 


2.5.  THE  EXPERT 

The  Advanced  user  has  the  ability  to  use  the  language  with  relative 
ease.  Since  any  computer  language  is  restricted  in  scope,  it  can 
limit  a  user  (fc  oxample,  the  inability  to  have  abstract  data  types 
in  FORTRAN  77).  The  Advanced  user,  knowing  the  scope  of  the  language, 
is  constrained  when  faced  with  a  new  problem  whose  solution  cannot  be 
derived  from  existing  functions  or  ob.^ects  within  the  system.  The 
Expert  transforms  this  finite  system  into  a  generative  one.  When 
faced  with  the  above  situation,  he  creates,  not  derives,  a  new 
syntactic  element  within  the  system.  Thus  the  Expert  expands  the 
existing  system,  creating  new  objects  and  functions. 


3.  TRANSACTION  TAXONOMY 

While  the  sophistication  level  of  the  user  is  important,  it  is 
necessary  to  know  how  a  transaction  is  processed  in  order  to  acquire 
additional  assistance  information .  A  transaction  is  dorfined  as  t’no 
task  contemplated  by  the  user  (For  example;  writing  a  program, 
"checking-in"  an  airline  passenger,  or  performin'  a  data  base  query) . 

The  five  stage  transaction  taxonomy  shown  belov#  builds  upon  a  simple 
taxonomy  (command  nd  data  input,  processing,  and  system  output)  by 
expanding  the  first-  operation,  input,  into  its  semantic  and  syntactic 
components  as  suggested  by  Shneiderman  (Shneiderman  1979) . 


STAGE 

ACTION 

I 

Task  Analysis 

TI 

Semantic  Analysis 

III 

Syntactic  Analysis 

IV 

System  Performance 

V 

Response  Analysis 

3.1.  STAGE 

I  —  TASK  ANALYSIS 

In  the  first  stage  tlie  user  decomposes  a  single  conceptual  task  into 
its  component  subtasks  and  determines  the  specific  commands  required  , 
for  task  completion.  The  user  asks  the  question,  "What  steps  and 
commands  are  necessary  to  perform  the  overall  task?"  For  example, 
running  a  program  (the  single  conceptual  task)  may  require  the 
following  subtasks:  editing,  compilation,  collection,  and  execution. 

It  is  possible  that  more  than  one  step  can  be  included  within  a  single 
command  (for  example  a  compile-load-go)  or  more  than  one  subtask  is 
required  within  each  subtask  (for  example  operations  with  the  editor). 
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The  cognitive  processes  <it  this  stnge  may  Include  all  or  some  of  tlia 
following  steps: 

1.  Identification  of  the  full  task. 

2.  Decomposition  of  the  task  into  its  subtasks  or  steps. 

3.  Definition  of  the  conceptual  operation  for  each  step. 

4.  Choice  of  the  appropriate  command  for  the  implementation  of 
each  step. 

It  should  not  be  assumed  that  all  commands  will  be  chosen  at  the 
outset.  It  is  highly  probable  that  an  individual  will  determine  the 
conceptual  operation  for  the  first  subproblem,  choose  an  appropriate 
command,  perform  it,  assess  the  result,  then  progress  to  the  next 
conceptual  operation,  the  choice  of  which  may  be  influenced  by  the 
result  of  a  previous  task. 

Once  the  conceptual  operation  has  been  defined,  a  user  may  wish  to 
examine  the  set  of  commands  for  its  implementation.  It  is  possible  to 
relate  commands  and  conceptual  operations  in  two  ways:  define  a 
conceptual  operation  for  commands  that  are  conceptually  related,  or 
its  antithesis,  to  extract  from  a  conceptual  operation  its  constituent 
commands.  By  iterating  between  these  perspectives,  it  sliould  be 
possible  for  the  user  ''.o  determine  a  command  that  allows  the 
conceptual  operation  t'jo  be  performed. 

A  command  subset  of  a  hypothetical  editor  illustrates  this  iterative 
approach.  Consider  the  command  "LOCATE"  (this  searches  the  text 
printing  the  lines  whenever  a  string  occurs).  The  specific  to  general 
relationship  would  be; 

"LOCATE" - >sgarch 

print 

The  general  concept  print  may  refer  to  a  number  of  commands  that,  if 
successfully  executed,  print  a  line: 

print - >"PRTNT" 

"LOCATE" 

"FIND" 

"GOTO" 

"NEXT"  ■ 

If  all  commands  of  the  concept  search  print  a  line,  then  the  structure 
could  be  represented  as: 

print - >"PRINT" 

"GOTO" 

"NEXT" 

search - >"  LOCATE" 

"FIND" 
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A  similar  grouping  can  occur  for  "GOTO"  and  "NEXT". 

When  explanations  are  provided  (basic  semantic  information)  within  the 
above  framework,  the  user  can  obtain  the  information  in  a  unified 
manner . 


3.2.  STAGE  ri  —  SEMANTIC  ANALYSIS 

In  the  second  stage,  the  scope  of  the  command  is  considered  by  the 
user.  Upon  entry  to  the  semantic  analysis,  the  command  is  conceptual 
in  the  broadest  sense.  Now  it  must  be  refined  into  its  detailed 
semantic  components. 

The  question;  "What  do  I  want  to  do?"  is  asked  by  the  user.  The  user 
must  be  cognizant  of  two  semantic  concepts:  definition  of  the  data  and 
the  control  of  the  process.  A  sorting  program  illustrates  the  type  of 
information  considered  by  the  user.  A  user  must  be  aware  of  the  data 
restrictions  (eg.  numerics  only,  alphanumerics,  maximum  number  of 
items,  maximum  number  of  fields,  etc.)  and  the  method(s)  of  data 
storage  or  entry.  In  addition,  information  is  required  to  control  the 
processing  (ascending,  descending,  key(s) ,  collating  sequence,  etc.). 
At  the  semantic  stage,  it  is  unnecessary  to  know  how  to  encode  this 
information. 


3.3.  STAGE  III  —  SYNTACTIC  ANALYSIS 

V/hen  a  user  reaches  the  third  stage,  ^'coding  the  information,  the 
correct  function  has  been  chosen  and  ti‘e  semantics  for  task  completion 
are  understood.  Now  the  question  is,  "How  do  I  do  it?"  The 
translation  of  the  conceptual  operation  into  the  input  format  is 
purely  mechanical.  The  user  requires  syntactic  information  and 
techniques  that  facilitate  this  transformation.  The  form  of  the 
human-computer  interface  (command  language,  dialogues,  menus,  function 
keys,  etc.)  has  a  primary  impact  at  this  stage. 

3.4.  STAGE  IV  —  SYSTEM  PERFORMANCE 

System  response,  the  fourth  stage,  can  be  treated  as  a  "black  box". 

The  underlying  architecture  that  supports  the  interface  is  outside  the 
scope  of  this  paper. 


3.5.  STAGE  V  —  RESPONSE  ANALYSIS 

The  analysis  and  interpretation  of  the  response  produced  by  the 
software  is  the  final  stage  of  a  transaction.  The  user  now  asks, 

"What  have  I  done?"  The  primary  goal  of  a  response  is  to  provide  the 
user  with  relevant  information.  Unnecessary  details  that  obscure  this 
information  should  be  avoided.  Two  independent  topics  should  be 
considered;  verbosity  and  information  content  (Schneider  1930). 
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For  example,  if  the  task  is  to  assign  the  file,  MYFILE,  there  are  a 
number  of  possible  responses  if  it  is  successful  (ordered  by 
increasing  verbosity  and  content): 

1.  >  {a  prompt  for  the  next  command) 

2.  READY  {,  OK,  COMPLETE,...) 

3.  File  MYFILE  has  been  assigned. 

4.  MYFILE  assigned  with  the  PUBLIC,  and  CATALOG  options. 

5.  File  MYFILE  has  been  assigned.  It  can  be  used  by  anyone 
(PUBLIC)  and  will  exist  for  one  day  (CATALOGUED)  unless 
otherwise  requested.  To  keep  the  file  longer  than  one  day 
contact  the  file  administrator . 

The  last  response  is  an  example  of  layering.  Three  items  of 
information  have  been  displayed: 

1.  The  name  of  the  assigned  file 

2.  The  file  attributes 

3.  The  administrati  \/o  procedure  required  to  keep  the  file. 

In  a  similar  manner,  it  is  possible  to  design  a  layered  dELP  function 
(a  user  initiated  request  for  assistance) . 

A  corn^nd  may  not  •a‘''.vays  ‘■.armina*'?  successfully.  'Js=fu'' 

meaningful  error  messages  are  important.  Good  error  reporting  shoul'- 

provide  sufficient  information  for  the  user  to: 

1.  Understand  the  nature  of  the  error; 

Understand  the  source  of  the  error; 

3.  Understand  the  methods  for  recovery  or  correction. 

Again  the  questions  of  verbosity  and  information  content  are 
important.  Verbosity  may  be  correlated  with  th.e  rrumbor  of  times  an 
individual  has  seen  the  message,  while  information  content  should  be 
related  to  the  levels  in  the  user  taxonomy  and  task  requirements  . 

4.  INTERACTIONS  BETWEEN  TAXONOMIES 

The  user  and  transaction  taxonomies  should  not  be  considered  in 
isolation.  Based  upon  the  sophistication  level  of  the  user,  the  scop-' 
of  assistance  may  vary.  Different  segments  of  the  transaction 
taxonomy  need  to  bo  emphasized  or  deemphasizod .  The  method  of 
assistance  presentation  provided  to  individuals  at  different 
sophistication  levels  for  the  same  transaction  may  differ.  For 
example : 
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C:  FILE  MYFILE  HAS  BEEN  ASSIGNED  - 

u:  attributes 
C:  PUBLIC  CATALOGED 
u:  physical 

C:  SIZE  -  12  TRACKS.  LOCATED  ON  D2734.  UNFORMATTED 

In  order  to  better  understand  the  type  of  assistance  applicable  at 
each  level  of  use,  it  is  necessary  to  examine  the  requirements  of 
users  at  each  sophistication  level. 


4,1.  PARROT 

A  Parrot  operates  in  a  simple  "transcription  mode."  There  is  no 
consideration  of  input  variability.  The  best  form  of  input  assistance 
IS  an  example  or  a  single  choice  from  a  single  level  menu  system.  The 
latter  is  analogous  to  function  keys.  By  careful  design,  eiciisr  of 
these  approaches  can  be  extended  to  assistance  forms  suitable  for  a 
Novice. 

Only  two  basic  responses  can  exist  for  the  Parrot:  the  function 
completed  successfully,  or  it  was  unsuccessful.  If  an  unsuccessful 
response  is  provided,  it  can  only  state  that  the  command  was 
incorrectly  entered  and  should  bo  entered  again  (a  Parrot  does  not 
comprehend  the  command's  contents).  If  the  system  is  unable  to 
perform  the  task  at  this  time,  it  can  be  suggested  that  the  user  try 
l-^tcr.  ninco  task  compl'^tion  is  the  reward  for  successful  conne-ii 
entr/,  this  infermstijn  should  always  be  provid  -d  to  the  usvr. 

Thus  at  the  Parrot  level  there  is  only  one  type  of  input  assistance: 
an  example. 


4.2.  NOVICE 

The  Novice  may  not  distinguish  between  the  first  three  stages  of  a 
transaction  (Task,  Semantic,  and  Syntactic  Analysis!.  Thus,  these 
stages  should  not  be  differentiated  if  the  user's  perspective  is  to  be 
reflected  in  the  interface.  The  system  should  lead  the  user  from  the 
determination  of  the  subtask(s),  through  the  isolation  of  the  correct 
command  and  tiie  determination  of  its  semantic  components,  to  the 
encoding  of  the  information. 

Once  t!ie  user  is  ready  to  provide  data  for  the  command,  a  number  of 
techniques  can  be  applied.  As  stated  earlier,  continuity  betwceen  che 
first  three  stages  is  important;  the  user  should  be  unav/aro  of  any 
distinct  phase  of  the  transaction.  Since  the  traditional  command 
format  may  be  inapplicable  to  the  Novice,  menus  could  be  used  for 
stages  I  and  II  followed  by  a  mixture  of  menus,  dialogs,  and 
" f i 1 1- in-the-blanks"  for  stage  III. 
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This  expands  the_syntactic  assistance  to  two  lev.els; 


Assistance 

Type 

Example 
Simplest  Form 


Sophistication 

Level 

Parrot 

Novice 


Irrespective  of  the  technique,  the  computer  should  take  the 
initiative;  the  Novice  nay  not  know  what  information  is  required,  or 
even  if  it  is  available.  Thus,  it  is  incumbent  upon  the  assistance 
system  to  announce  its  existence.  Information  for  clarification, 
however,  should  be  provided  only  upon  demand.  To  do  so  automatically, 
may  unnecessarily  confuse  or  annoy  the  user. 

Responses,  aside  from  providing  information  to  the  user,  should 
indicate  the  successful  completion  of  the  command  in  a  non-null  form 
(something  more  than  a  prompt) .  A  Novice,  lacking  confidence  in  the 
ability  to  control  the  system,  may  require  this  positive 
reinforcement. 


2.1.  INTERMEDIATE 


Because  the  Intermediate  is  familiar  with  the  system,  the  user,  not 
the  computer,  should  take  the  initiative.  An  individual  at  this 
sophistication  level  has  the  ability  to  decompose  a  task  into  its 
ijubtasks  ana  determine  an  appropriate  command  (Staoc  I).  Since  che 
components  of  the  system  are  known  to  exist,  even  if  not  understood, 
information  should  be  factored  into  the  following  topics;  command 
semantics,  command  syntax,  and  field  or  keyword  semantics  and  synt'x. 
Since  individuals  generally  employ  a  subset  of  commands  (Huckie  19^0), 
assistance  is  still  required  for  those  used  less  commonly. 


Assistance  in  the  semantic  and  syntax  analyses  (Stages  i:  and  HI) 
require  additional  information.  As  a  user  gains  experience  with  a 
command,  defaults  are  better  understood,  overridden,  or  modified. 

Thus,  the  scope  of  the  command  perceived  by  the  user  is  extended.  The 
semantic  and  syntactic  expansion  of  commands  requires  that  two  new 
levels  of  assistance  must  be  added: 


1.  The  most  common  form  of  the  command.  This  will  occur  when 
some  commonly  defaulted  items  are  overridden. 


2. 

Thus,  the 


The  command  is  used  in  its  full  form.  This  occurs  when  no 
item  is  defaulted. 

number  of  levels  are  increased  to  four; 


A.ssi  stance 
Type 

Example 
Simplest  Form 
common  Form 
Full  Concrete  Form 


Sophistication 

Level 

Parrot 

Novice 

Intermediate 

Intermediate 
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When  the.  senipinticsjand  syntax  of  a  command  are  not  complicated,  two  or 
even  one  of  the  above  forms  may  fulfill  the  information  requirements. 

Because  the  Intermediate  operates  in  a  terse  mode,  abbreviated  forms 
of  the  command  should  be  provided.  This  includes,  not  only  contracted 
forms  of  the  strings  within  the  command  (name,  keywords,  flags,  etc.) , 
but  the  items  that  can  be  defaulted  and  the  values  supplied. 

The  layered  approach  for  responses  should  be  available.  As  in  case  of 
information  required  for  the  input  of  a  command,  the  user  should  be 
able  to  request  specific  information.  The  advantages  (terseness  and 
specificity)  of  requesting  specific  information  is  offset  by  the  need 
for  a  query  language. 


2.2.  ADVANCED 

The  needs  of  the  Advanced  user  differ  from  the  Intermediate  in  three 
ways. 

1.  The  transaction  stages  considered  prior  to  entering  a 
command  require  a  different  emphasis  because  data  and 
control  structures  are  now  a  part  of  the  user's  command 
repertoire. 

2.  There  is  a  need  for  assistance  in  the  monitoring  of  an 
executing  command  since  they  are  executed  in  a  "batch 
env i ronment"  . 

3.  A  different  type  of  response  structure  is  needed  since  it 
must  be  interpreted  dit.--ctly  by  i  command  within  the 
software  without  human  intervention. 

Within  the  first  two  stages,  an  increase  in  the  type  of  information 
exists,  reflecting  the  added  control  and  data  structures  employed  by 
the  ‘d-'on'^ed  user.  These  new  structures  may  be  implemented  within  an 
existing  command  or  via  new  commands.  Assiscar.ce  and  instruction  in 
the  methods  of  building  macios,  procedures  and  programs  are  useful  for 
the  Advanced  user.  These  new  functional  elements  are  reflected  not 
only  in  Stages  II  and  £11,  but  their  concepts  must  be  included  in 
Stage  t. 

Control  and  data  structures  are  now  used  in  the  development  of 
procedures.  This  places  additional  demands  upon  the  response  segment. 
Whereas  in  the  lower  sophistication  level  interfaces,  the  responses 
must  be  understood  by  a  human,  in  a  procedures,  responses  must  be 
understood  by  the  software. 

The  abstract  nature  of  the  command  requires  additional  syntactic 
information.  When  a  command  has  constructs  that  relate  only  to  these 
structures,  they  must  exist  only  in  the  information  supplied  to  the 
Advanced  user.  Thus,  in  addition  to  the  three  assistance  levels 
applicable  to  the  Novice  and  Intermediate  users,  a  fourth  level. 
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containing  the  expanded  language  view  must  be  included.  The  five 
levels  of  assistance  are  shown  below: 


Assistance 

Type 

Example 
Simplest  Form 
Common  Form 
Full  Concrete  Form 
Full  Abstract  Form 


Sophistication 

Level 

Parrot 

Novice 

Intermediate 

Intermediate 

Advanced 


3.  CONCLUSION 


On  a  theoretical  basis,  it  is  possible  to  factor  software  user 
assistance  information  into  three  independent  categories: 

1.  verbosity 

2.  user  sophistication 

3.  task  segmentation 


Although  it  is  possible  to  prepare  guidelines  fo 
classification  of  information  within  each  catego 
investigations  will  validate  these  suppositions, 
studies  of  specific  topics  are  in  progress. 


r  the  further 
ry,  only  experimental 
At  the  present  time 
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SYSTEM  MESSAGE  GUIDELINES: 


POSITIVE  TONE,  CONSTRUCTIVE,  SPECIFIC,  AND  USER  CENTERED 


Ben  Shneiderman 
University  of  Maryland 
Department  of  Computer  Science 
College  Park,  MD  20742 
January  27,  1981 


***  Draft  paper  prepared  for  Workshop  on  Human  Factors  in 
Interactive  Systems,  Georgia  Institute  of  Technology,  March 
26-27,  1981,  Atlanta,  Georgia. 


Prompts,  explanations,  error  diagnostics,  and  warnings  play  a 
critical  role  in  influencing  user  acceptance  of  software  systems. 
Programming  and  command  languages  and  application  systems  are 
appreciated  not  only  for  the  functionality  they  offer  but  for  the 
phrasing  of  system  messages  in  a  specific  implementation.  This 
is  true  for  batch  systems,  but  it  is  more  important  for 
interactive  systems  in  which  the  impact  of  a  message  is  immediate 
and  more  dramatic. 


The  wording  of  prompts,  advisory  messages,  and  system  responses 
to  commands  may  influence  user  perceptions,  but  the  phrasing  of 
diagnostic  messages  or  warnings  about  improper  conditions  is 
critical.  Since  errors  occur  because  of  lack  of  knowledge, 
incorrect  understanding  or  inadvertent  slips,  the  user  is  likely 
to  be  confused,  feel  inadedquate,  and  be  anxious.  Messages  with 
an  imperious  tone,  which  condemn  the  user  for  an  error,  can 
heighten  user  anxiety,  making  it  more  difficult  to  correct  the 
error  and  increasing  the  chances  for  further  errors.  Messages 
which  are  too  generic,  such  as  the  ubiquitous  "SYNTAX  ERROR", 
obscure  "FAC  RJCT  004004400400",  or  mystical  "0C7"  offer  little 
assistance  to  the  novice  user. 


These  concerns  are  especially  important  with  respect  to  the 
novice  user  whose  lack  of  knowledge  and  confidence  amplify  the 
stress  related  feedback  which  can  lead  to  a  sequence  of  failures. 
The  discouraging  effects  of  a  bad  experience  in  using  a  computer 
are  not  easily  overcome  by  a  few  good  experiences.  In  fact,  I 
suspect  that  systems  are  remembered  more  for  what  happens  when 
things  go  wrong  than  when  things  go  right.  Although  these 
effects  are  most  prominent  with  novice  computer  users, 
experienced  users  also  suffer.  Experts  in  one  system  or  part  of 
a  system  are  still  novices  for  many  situations. 
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^wareness  of  the  dif ^ that  novices  encounter  has  prompted 
the  development  of  student-oriented  compilers  for  some  languages, 
which  emphasize  good  diagnostic  messages  and  even  limited  error 
correction.  The  early  DITR^N  effort  (Moulton  and  Muller,  1967) 
and  CORC  (Freeman,  1964)  were  followed  by  the  WATFOR/W^TFIV 
compilers  (Cress,  Dirksen  and  Graham,  1976)  and  the  PL/C  compiler 
(Conway  and  Wilcox,  1973).  These  efforts  demonstrate  what  can  be 
accomplished  if  the  developers  are  sincere  about  their  concern 
for  ease  of  use.  PL/C  and  WATFIV  are  widely  used  in  academic 
environments  not  only  because  of  their  diagnostic  messages  but 
also  because  of  their  rapid  compilation  speeds.  These  systems 
demonstrate  that  although  there  may  be  a  greater  development  cost 
for  good  diagnostics,  the  production  costs  can  be  kept  low. 
Although  I  am  not  aware  of  any  controlled  experimental  research 
which  proves  that  students  using  these  compilers  learn  faster, 
make  fewer  errors  or  have  a  more  positive  attitude  toward 
computers,  these  hypotheses  are  shared  by  many  people.  Rigorous 
human  factors  studies  would  be  useful  in  evaluating  the 
improvement  brought  about  by  these  systems  and  would  be  helpful 
in  convincing  skeptics  about  the  importance  of  designing  good 
system  messages. 


Producing  a  set  of  guidelines  for  writing  system  messages  is  not 
an  easy  task  because  of  differences  of  opinion  and  the 
impossibility  of  being  complete.  Inspite  of  these  dangers,  I 
feel  that  producing  such  guidelines  could  yield  better  systems. 
Input  parsing  strategies,  message  generation  techniques,  and 
message  phrasing  can  be  changed  without  affecting  system 
functionality.  Hopefully,  more  attention  to  system  messages  will 
lead  to  instrumentation  of  systems  to  capture  data  on  error 
frequency  distributions.  Such  data  will  enable  system  designers 
and  maintainers  to  revise  error  handling  procedures,  improve 
documentation  and  training  manuals,  alter  instructional 
materials,  or  even  change  the  programming  or  command  language 
syntax.  Focusing  increased  attention  on  system  messages  should 
compel  system  developers  to  include  the  complete  set  of  messages 
in  user  manuals.  This  high  visibility  will  produce  even  more 
concern  for  the  quality  of  these  messages. 


These  comments  are  the  result  of  experience  and  subjective 
evaluation.  Controlled  psychologically-oriented  experimentation 
would  be  useful  in  verifying  these  conjectures. 


BE  SPECIFIC 

Messages  which  are  too  general  make  it  difficult  for  the  user  to 
know  what  has  gone  wrong.  The  simple  minded  and  condemning 
messages  such  as  "SYNTAX  ERROR"  or  "ILLEGAL  ENTRY",  or  "INVALID 
DATA"  are  frustrating  because  they  do  not  provide  enough 
information  about  what  has  gone  wrong.  Improved  versions  might  be 
"Unmatched  left  parenthesis",  "Legal  commands  are:  Send,  Read, 
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File,  or  Drop",  or  "Days  must  be  in  the  range  of  1  to  31 


11 


Even  in  widely  appreciated  systems  like  WATFIV  there  is  room  for 
improvement.  Messages  such  as  "INVALID  TYPE  OF  ARGUMENT  : U 
REFERENCE  TO  A  SUBPROGRAM"  or  "WRONG  NUMBER  OF  ARGUMENTS  IN  A 
REFERENCE  TO  A  SUBPROGRAM"  might  be  improved  if  the  name  of  the 
subprogram  were  included  and  the  correct  type  or  number  of 
arguments  were  provided.  The  APL  system  which  has  so  many  nice 
human  factor s-or tented  features  comes  out  poorly  when  evaluated 
for  system  messages.  The  extremely  brief  "SIZE  ERROR",  "RANK 
ERROR",  or  "DOMAIN  ERROR"  comments  are  too  cryptic  for  novices 
and  fail  to  provide  information  about  which  variables  are 
involved.  On  the  plus  side,  the  standardization  (most  systems 
use  the  APL360  messages)  of  messages  does  make  it  easier  for 
users  to  move  from  one  system  to  another.  I  have  long  felt  that 
language  standardization  efforts  should  include  standardization 
of  at  least  the  fundamental  messages. 


Execution  time  messages  in  programming  languages  should  provide 
the  user  with  specific  information  about  whore  the  problem  arose, 
what  variables  are  involved  and  what  values  were  improper.  When 
division  by  zero  occurs  some  processors  will  terminate  with  a 
crude  message  such  as  "DOMAIN  ERROR"  in  APL  or  "SIZE  ERROR"  in 
some  COBOL  compilers.  PASCAL  specifies  "division  by  zero"  but 
may  not  include  the  line  number  or  variables  that  the  PLUM 
compiler  offers  (Zelkowitz,  1976) .  Maintaining  symbol  table  and 
line  number  information  at  execution  time  so  that  better  messages 
can  be  generated  is  usually  well  worth  the  modest  resource 
expenditure. 


Systems  whicn  offer  a  code  number  for  error  messages  are  also 
annoying  because  the  manual  may  not  be  available  and  consulting 
it  is  disruptive  and  time  consuming.  In  most  cases,  system 
developers  can  no  longer  hide  behind  the  claim  that  printing 
complete  messages  consumes  too  many  system  resources. 


BE  CONSTRUCTIVE 


Rather  than  condemning 
where  possible  tell 
right.  Nasty  messages 
ABANDONED."  (from  a 


the  users  for  what  they  have  done  wrong, 
them  what  they  need  to  do  to  set  things 
such  as  "DISASTROUS  STRING  OVERFLOW.  JOB 
well-known  compiler-compiler) ,  "UNDEFINED 


LABELS",  or  "ILLEGAL  STA.  WRN."  (both  from  a  major  manufacturer's 
FORTRAN  compiler)  can  bo  replaced  by  more  constructive  phrases 

space  consumed.  Revise  program  to  use  shorter 
string  space.",  "Define  statement  labels  before 
I  statement  cannot  be  used  in  a  FUNCTION 


such  as  "String 
strings  or  expand 
use",  or  "RETURN 
subprogram" . 
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It  may  be  difficu  t  for  the  compiler  writer  to  write  code  which 
accurately  determines  what  the  user's  intention  was,  so  the 
advice  to  be  constructive  is  often  difficult  to  apply,  I  believe 
that  error  correcting  compilers  should  be  extremely  conservative 
for  the  same  reason.  Automatic  error  correction  has  the  danger 
that  users  will  fail  to  learn  proper  syntax,  and  become  dependent 
on  the  compiler  making  corrections  for  them.  For  interactive 
systems  the  user  can  be  consulted  before  corrections  are 
automatically  applied. 


BE  OSER-CENTERED 

By  user-centered  I  mean  that  the  user  controls  the  system  rather 
than  the  system  directs  the  user  what  to  do.  This  is  partially 
accomplished  by  avoiding  the  negative  and  condemning  tone  in 
messages  and  by  being  courteous  to  the  user.  If  the  system  will 
take  a  long  time  to  respond  to  a  command  then  the  user  should  bo 
informed  with  a  simple  estimate  of  the  time.  Prompting  messages 
should  avoid  the  imperative  forms  such  as  "ENTER  DATA"  and  focus 
on  user  control  such  as  "READY  FOR  COMMAND"  or  simply  "READY". 


Brevity  is  a  virtue,  but  the  user  should  be  allowed  to  control 
the  kind  of  information  provided.  Possibly  the  standard  system 
message  should  be  less  than  a  line,  but  by  keying  a  "?"  the  user 
should  be  able  to  get  a  few  lines  of  explanation.  Two  question 
marks  might  yield  a  set  of  examples  and  three  question  makks 
might  produce  explanations  of  the  examples  and  a  complete 
description.  The  CONFER  teleconferencing  system  provides 
appealing  assistance  similar  to  this.  The  PLATO  computer 
assisted  instruction  system  offers  a  special  HELP  button  and 
other  options  to  provide  explanations  when  the  student  needs 
assistance. 


The  designers  of  the  Library  of  Congress’  SCORPIO  system  (Woody 
et  al.,  1977)  for  bibliographic  retrieval  understood  the 
importance  of  making  the  users  feel  that  they  are  in  control.  In 
addition  to  using  the  properly  subservient  "READY  FOR  NEXT 
COMMAND"  the  designers  avoid  the  use  of  the  words  "error"  or 
"invalid"  in  the  text  of  system  messages.  Blame  is  never 
assigned  to  the  user  but  instead  the  system  displays  "SCORPIO 
COULD  NOT  INTERPRET  THE  FOURTH  PART  OF  THE  COMMAND  CONTENTS, 
WHICH  IS  SUPPOSED  TO  BE  A  4-CHARACTER  OPTION  CODE."  The  message 
then  goes  on  to  define  the  proper  format  and  present  an  example 
of  its  use. 


USE  M  APPROPRIATE  PHYSICAL  FORMAT 

Although  professional  programmers  have  learned  to  read  upper  case 
only  text,  most  novices  prefer  and  find  it  easier  to  read  upper 
and  lower  case  messages.  Messages  that  begin  with  a  lengthy  and 
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mysterious  code  number  only  serve  to  remind  the  user  that  the 
designers  were  insensitive  to  the  real  needs  of  users.  If  code 
numbers  are  needed  at  all  they  might  be  enclosed  in  parentheses 
at  the  end  of  a  message. 


There  is  some  disagreement  about  the  placement  of  messages  in 
program  listing.  One  school  of  thought  argues  that  the  messages 
should  be  placed  at  the  point  in  the  program  where  the  problem 
has  arisen.  The  second  opinion  is  that  the  messages  clutter  the 
listing  and  anyway  it  is  easier  for  the  compiler  writer  to  'place 
them  all  at  the  end.  This  is  a  good  subject  for  experimental 
study,  but  I  would  vote  for  placing  messages  in  the  body  of  the 
listing  assuming  that  a  blank  line  is  left  above  and  below  the 
message  so  as  to  minimize  interference  with  reading  the  listing. 
Of  course,  certain  messages  must  come  at  the  end  of  the  listing 
and  execution  time  messages  must  appear  in  the  output  listing. 


Some  application  systems  ring  a  bell  or  sound  a  tone  when  an 
error  has  occurred.  This  can  be  useful  if  the  error  could  be 
missed  by  the  operator,  but  it  is  extremely  embarrassing  if  other 
people  are  in  the  room  and  potentially  annoying  even  if  the 
operator  is  alone.  The  use  of  audio  signals  should  be  under  the 
control  of  the  operator. 


The  early  high  level  language,  MAD  (Michigan  Algorithmic  Decoder) 
printed  out  a  full  page  picture  of  Alfred  E.  Neuman  if  there  were 
syntactic  errors  in  the  program.  Novices  enjoyed  this  playful 
approach,  but  after  they  had  accumulated  a  drawer  full  of 
pictures,  the  portrait  became  an  annoying  embarrassment. 
Highlighting  errors  with  rows  of  asterisks  is  a  common  but 
questionable  approach.  Designers  must  walk  a  narrow  path  between 
calling  attention  to  a  problem  and  avoiding  embarrassment  to  the 
operator.  Considering  the  wide  range  of  experience  and 
temperment  in  users,  maybe  the  best  solution  is  to  offer  the  user 
a  choice  of  alternatives  -  this  coordinates  with  the 
user-centered  principle. 


2.  EXPERIMENTAL  RESULTS 


2.1  COBOL  Compiler  Messages 

A  pilot  study  was  run  to  explore  the  impact  of  improved  messages 
on  the  ability  of  programmers  to  locate  and  repair  bugs.  The 
experiment,  carried  out  by  Patrick  Peck  and  David  Fuselier  under 
the  direction  of  the  author,  was  administered  to  22  second  term 
COBOL  students  at  the  University  of  Maryland  in  Fall  1979. 


Five  bugs  were  included  in  a  132  line  COBOL  program  yielding  the 
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following  messages  from  a  UNIVAC  COBOL  compiler: 

1)  RESERVED  WORD  USED  AS  PARAGRAPH  OR  SECTION  NAME  IGNORE 
ATTEMPT  RECOVERY  HERE  AFTER  PREVIOUS  ERROR 

2)  DANGLING  ELSE  OR  WHEN;  TREATED  AS  AN  IMPERATIVE 

3)  UNDEFINED  DATA  ITEM  STATEMENT  OMITTED 
ATTEMPT  RECOVERY  HERE  AFTER  PREVIOUS  ERROR 
PREVIOUS  ERRORS  CAUSE  LOSS  OF  OBJECT  CODE 

4)  WORD  NOT  A  VERB;  SCAN  SKIPS  TO  NEXT  VERB 
ATTEMPT  RECOVERY  HERE  AFTER  PREVIOUS  ERROR 

5)  BLANK  MISSING  BEFORE  OPERATOR  OR  LEFT  PARENTHESIS 
BLANK  MISSING  AFTER  ARITH/COND  OPERATOR  OR  PUNCTUATION 

A  second  version  of  the  listing  was  produced  with  the  following 
five  improved  messages; 

1)  PERIOD  IN  PREVIOUS  LINE  CONTAINED  IN  IP  STATEMENT,  DELETE 

2)  EXTRANEOUS  ELSE  IN  PREVIOUS  LINE,  DELETE 

3)  BLANKS  IS  UNDEFINED  DATA  ITEM,  MUST  USE  SPACES 

4)  USE  AFTER  PAGE  INSTEAD  OF  AFTER  1  PAGE 

5)  SPACE  REQUIRED  BEFORE  OPERATOR 
SPACE  REQUIRED  AFTER  OPERATOR 

Code  numbers  and  severity  levels  were  eliminated  in  the  improved 
messages  and  a  single  blank  line  was  left  above  and  below  the 
improved  messages.  Eleven  copies  of  each  of  the  listings  were 
produced  and  randomly  distributed  to  the  subjects.  Seven  minutes 
were  allowed  to  locate  and  repair  the  bugs.  One  point  was  given 
for  locating  the  error  and  two  points  were  given  for  correcting 
the  bug,  yielding  a  maximum  score  of  10  points. 


Subjects  with  the  UNIVAC  COBOL  compiler  listing  had  an  average  of 
6.6  points  while  those  with  the  improved  messages  had  an  average 
of  8.5  points.  A  t-test  yielded  a  significant  difference  at  the 
5%  level. 


The  results  of  this  pilot  study  should  be  considered  exploratory. 
Replications  should  be  performed  with  other  messages, 
professional  subjects,  and  different  languages.  A  more  realistic 
study  could  be  performed  if  two  versions  of  the  same  language 
compiler  were  available.  One  group  of  subjects  would  be  required 
to  work  with  the  standard  version  and  the  other  group  of  subjects 
would  work  with  the  improved  message  version.  Capturing 
performance  in  actual  projects  over  longer  time  frames  could 
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demonstrate  the  true  impact  of  improved  messages 


2.2  COBOL  Compiler  Messages:  Tone  and  Specif ity 

2.3  Presence  or  Absence  of  Text  Editor  Messages 

2.4  Tone  and  Content  of  Text  Editor  Messages* 

2.5  Job  Control  Language  Messages 
3.  CONCLUSIONS 

REFERENCES 
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1.  Introduction 

Language  designers  and  language  proponents  are  often  given 
to  making  claims  about  the  "readability*"  "debug-ability*" 
"understandability* "  "learnability* "  "naturalness*"  etc.  of  a 
)  particular  programming  language.  For  the  most  part  such 
claims  are  psychological  in  nature*  and  thus  open  to  empirical 
inquiry.  The  problem  is  that  thi»  type  of  research  is  difficult 
to  carry  out  and*  frankly*  only  lip  service  (and  "lip  resources") 
to  its  need  is  given  by  the  computing  community.  Moreover*  with 
the  major  push  behind  Ada  and  methodologies  appropriate  to  large 
Ttcale  software  development*  the  needs  of  novice  programmers  have 
gotten  particularly  short  shrift.  We  increasingly  see  the 
attitude  that  a  "programmer"  is  a  '  person  who  works  on  a  100 
person  team  on  some  massive  project  — 7  not  someone  tailoring 
their  home  "mail  network"  or  interacting  with  a  computerixed  — — 
"programmable"  toy.  This  view  of  programming  seems  a  bit 
narrow. 

With  that  introductory  polemic*  let  us  turn  to  the  specifics 
of  our  presentation.  We  have  been  looking  at  how  novice  Pascal 
users  cope  with  problem  solving  in  Pascal.  <!>  In  this  extended 
abstract  we  shall  first  highlight  several  pascal  constructs  which 
are  particularly  troublesome.  Next*  we  shall  make  a  more  general 
statement*  based  also  on  empirical  data*  on  the  need  to  keep 
procedurality  in  programming  languages. 


II.  g^rformance  Analusis:  Read  j/Process  i  ^s*_  Process  i/Read 

Consider  problem  3  in  Table  1.  For  this  problem,  the 
stylistically  correct  solution  in  Pascal  requires  a  curious 
coding  structure: 

read  first-value 
while  (test  ith  value) 
process  ith  value 
read  next-ith  value 

The  loop  must  not  be  executed  if  the  test  variable  has  the 
Specified  value*  and  this  value  could  turn  up  on  the  first  read* 
thus*  a  read  outside  the  loop  is  necessary  in  order  to  "get  the 
thing  going. "  However*  this  results  in  the  loop  processing  being 
"behind  the  read;  it  processes  the  ith  input  and  then  fetches 
the  next— i.  We  call  this  structure  "process  i/read  next— i,  " 


■ClJ  One  goal  of  our  project*  which  will  not  be  reported  on  in 
this  summary*  is  to  build  a  Run-Time  Support  Environment  for 
novice  Pascal  users.  This  system*  components  of  which  are 
Currently  being  built*  will  attempt  to  catch  run-time  bugs  (not 
compile  time  errors*  which  are  adequately  handled  in  other 
Systems)  in  students'  programs*  and  provide  remediation  with 
respect  to  the  underlying  mental  misconceptions. 


Problf  1*  U^iU  •  pro(r«i  Mhloh  rtMs  tO  InUftri  and  than 
printa  out  tha  avaratt«  ^aMbtr ,  th#  avtrac*  of  a 
aariaa  of  nunbara  la  tha  atai  of  thoaa  nuabara  dlvidad 
by  how  aany  nuibara  thara  ara  in  tha  aariaa. 

frobian  2.  Wflta  a  proiraa  which  rapaattdly  raada  In 

tntagara  until  thalr  avai  la  graatar  than  100.  After 
raaehlAg  100.  tha  profroi  ahould  print  out  tha  avaraca 
of  tha  iQtaitra  antarad. 

Probliw  2*  Write  a  prograa  which  rapaattdly  raada  In 

Intagcra  until  It  rtads  tha  inttg ar  99999*  After 
•tains  99999*  It  ahould  print  out  the  corrtet  avaraga. 
That  la,  la  ahould  not  count  tha  final  99999* 


tibia  Probltnn  uaad  In  our  test  Inatruitnt.  These  problana 
««ra  given  to  an  Introductory  progmaing  elasa^on  the  last  day 
of  the  couraa.  they  art  designed  to  test  student  Icnowltoga  of 
Kty  diffarencts  batwatn  different  loop  constructs  In  Pascal. 


prograo  Studant7_^Problea3. 

VST  N,  Sun.  X  :  Idtagar ; 
Avaraga  :  real; 

Stop  :  boolean; 


Stop  ;«  false; 

N  :>  0; 

Sun  :«  0; 
while  not  Stop  do 
begin 
Head  (X); 

U  X  «  99999 

then  Stop  :»  true 
else  begin 

Sun  ;«  Sun  ♦  X; 
N  :>  N  ♦  1 


end 

end;  . 

Average  :«  Sum  /  N; 
Wrltcln  (Average) 
end , 


progrin  Studentd^PreblenS; 


program  Student  l6_^Problen3; 


var  Cduit.  Sun.  Nunber  :  Integer ;  Average  :  real ; 


var  Count.  Sum,  Nun  :  integer ;  Average  :  real; 


Count  :«  0; 

Sum  Is  0; 

Head  (Nunber); 
wniU  Nunber  <>  99999  ^ 
begin 

Sum  Is  Sun  ♦  Nunber; 
Count  :»  Couit  ♦  1 : 
Head  (Nunber) 
end; 

Average  iT" Sun  /  Count; 
Wrltaln  (Average) 
end . 


figure  I  A  stylistically  correct  solution  to  problem  3  in  table 
1.  Note  *.he  nsec  for  two  Head  calls  ana  the  curious  "process  the 
last  ^alue,  read  the  next  value"  semantics  of  the  loop  body, 
‘this  program  was  minimally  edited  for  presentation  here . 
Stuoenta  wrote  these  programa  in  a  clasaroon.  They  were  never 
subnitted  to  a  translator. 


begin 

Count  :»  *1 : 
Sun  Is  0; 

OiUl 


Count  ;s  Count  ♦  1 ; 
Read  (Nun); 


Sun  ;s  Sun  ♦  hum 
until  hum  s  99999: 

Sutp  Is  GuiB  -  99999: 
Average  is  Sun  /  Count 
end . 


Figure  'These  progrims  are  attempts  at  problco  3  cescrioea  .n 
table  1.  They  are  typical  of  the  contortions  students  will 
triTougt,  to  make  this  probler  fall  into  a  "read  a  value,  process 
that  value”  Frame.  Thr  se  programs  have  been  ^irirall>  eciten  lor 
presentation  here.  Students  wrote  these  prograr-s  ir  a  classroom. 
They  were  never  sjomittcd  to  a  translator. 


!  Read  1/Procaaa  i 

Process  1/Read  Ib'ext^i  ‘  Ocher  i 

1  u»ed 

> 

UMS 

1  repeat  loop  i  while  looD 

other 

repvat  loop  ,  while  loop  , 

1  1  — — 

!  j 

Correct  j  A  1  2 

i 

i  1 

I  ■  -  1 

’  1  > 

! 

‘  2  *  i  : 

1  , 

.  1 

'  '  i 

!  1  1 

Incorrect  ^  3  '  3  1 

I  1 

<  1  ( 

1  1 

I  1 

1  ! 

>  4  1 

i  ! 

1  t 

! _ 1 

1  1 

•  \  1 

I  : . ! 

1 

i  !  i 

Table 


Ti.«  nuaberii  In  thl.  ubt.  c.ftr  to  ch«  .ctual  nuabtr  o: 
not  parcentagea. 
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Ona  of  tha  authors  ***■  tha  ona  with  less  Pascal  exparienca  *-** 
intuitivaly  fait  this  coding  strategy  to  be  unnecessarily  awkward 
and  downright  confusing.  Perhaps  a  more  "natural"  coding 
strategy  would  be  to  read  tha  ith  value  and  then  process  iti  we 
call  this  tha  "read  i/procass  i"  coding  strategy.  Others  have 
noticed  this  problem  before#  but  treated  it  largely  as  a  coding 
inconvianca.  Their  response  was  baroque  looping  constructs  which 
eliminated  writing  the  same  code  twice.  We  are  not  as  concerned 
with  elegance  as  with  learnabilitu.  Do  novice  programmers  use 
the  stylistically  correct  coding  strategy  <process  i/read 
next^'i)#  or  do  they  add  extra  machinery  to  a  while  or  repeat  loop 
<e.  g.#  an  embedded  iJi  test  tied  to  a  boolean  variable)  in  order 
to  force  the  code  into  a  read  i/process  i  structure? 

Table  2  lists  the  performance  of  those  students  who 
attempted  tha  problem  with  either  a  y|j>ile  or  repeat  loop.  Of  the 
9  who  solved  it  correctly#  only  2  used  the  stylistically  correct 
"process  i/read  next'-i"  coding  strategy.  (Sea  Figure  1  for  a 
Solution  using  this  coding  strategy. )  In  order  to  correctly 
Solve  the  problem  using  either  a  repeat  or  while  loop  and  the 
read  i/process  i  coding  strategy  requires  extra  machinery# 
Figure  2  shows  student  programs  which  use  this  strategy. 
Nonetheless#  the  vast  majority  of  students  attempted  this 
Solution#  given  the  extra  complexity  needed  for  a  correct 
solution#  it  is  not  surprising  that  many  failed. 

It  is  tempting  to  conclude  that  with  respect  to  these  types 
of  problems#  Pascal  requires  that  students  circumvent  their 
"natural"  problem  solving  intuitions.  Before  we  can  actually 
assert  this  conclusion#  more  research  needs  to  be  done  <1>.  But# 
since  we  must  live  with  Pascal  for  some  period  of  time  to  come# 
it  would  only  be  responsible  for  teachers  to  explicitlu  teach 
their  students  about  this  peculiar  coding  strategy. 


<!>  We  have  designed  and  pilot-tested  the  following  experiment: 
We  first  ask  all  students  to  write  a  plan  or  design  for  problem  3 
in  Table  1  (the  same  one  examined  in  this  section)#  in  a  language 
other  than  a  programming  language.  We  then  ask  half  the  students 
to  write  the  program  in  Pascal.  For  the  other  half  of  the  group# 
we  provide  a  one  page  description  of  constrained  version  of  the 
Ada  loop  ...  end  loop  construct  in  which  only  one  exit  from  the 
loop  body  is  allowed.  While  the  sample  size  was  small  (13 
students  in  total)#  the  data  is  suggestive:  invariably  the  plan 
of  the  students  was  Worded  in  terms  of  a  read  i/process  i. 
However#  the  Pascal  versions  were  typically  coded  with  a  process 
i/read  next-i  strategy.  But#  those  programs  written  using  the 
Ada  loop  ...  end  loop  were  coded  using  the  read  i/process  i 
strategy.  Thus#  the  program  coded  in  Ada  more  closely  matched 
the  students'  plans  than  did  those  program  coded  in  Pascal.  We 
plan  to  run  this  experiment  on  a  larger  group. 
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XXI.  E.8x£Biiiian&g,  gpUimi  a.  Vaiu^ 

In  all  3  problams  <Table  1)«  a  correct  solution  required 
that  the  program  "get  a  new  value  with  a  read. ”  23%  of  all  the 
student  written  p.rograms  did  not  perform  this  function  correctly. 
Often  students  try  to  get  the  previous  or  next  value  of  a 
variable  by  subtracting  or  adding  one  (see  Figure  3).  {!>  We 
also  found  programs  in  which  we  felt  students  assumed  that  each 
use  of  Next^yalue  automaticallu  retrieved  a  new  value. 

As  "expert  programmers"  we  have  a  great  deal  of  deep 
knowledge  about  how  to  program.  In  particular*  we  know  that 
variables  have  not  just  types*  but  also  roles.  Different  coding 
strategies  are  needed  to  realize  like  operations  on  variables 
whose  roles  are  different.  For  example*  "getting  the  next  value" 
implies  adding  one  for  a  counter  variable*  reading  for  a 
New^value  variable*  and  adding  in  the  Newjvalue  for  a 
Running.total  variable.  (The  problems  in  Table  1  need  one 
variable  in  each  of  these  roles.  )  Perhaps  students  committing 
the  above  errors  did  not  understand  or  garbled  these  difforent 
variable  roles. 

Misunderstanding  this  "deep"  knowledge  about  Pascal  —  ‘  mind 
bugs  —  could  result  in  many  different  student  errors  —  surface 
bugs.  Perhaps  students  committing  the  above  errors  did  not 
understand  that  read  is  actually  just  a  special  case  of 
assignment. .  If  so*  then  a  language  which  treated  I/O  calls  as 
special  values  which  can  be  assigned  "to"  or  "from"  might  be  more 
palatable  to  beginning  programmers*  a.  g.  * 

New^value  ;■  Read_from_terminal*  or* 

Write_to_terminal  :*  Running^sum  /  Count. 


Another  possible  mind  bug  which  could  result  in  some  of  the 
observed  errors  would  be  that  students  incorrectly 
overgeneralized  from  the  Counter  variable.  That  is*  since  the 
next  value  of  a  variable  functioning  as  a  counter  can  be 
retrieved  by  simply  adding  a  1  to  the  variable*  why  not  get  the 
next  value  of  any  variable  by  simply  adding  a  1  to  it!  While 
reasonable*  this  is  incorrect. 


XV.  p,erfgrj?»9nc.g  ftnplmiaj;,  IbJi  !!Pemoa^  la  while  looo-  test. 

Based  on  our  examination  of  student  programs*  and  on 
analysis  of  audio-taped*  individual  interviews*  we  felt  that 
there  was  a  great  deal  of  confusion  surrounding  the  time  at  which 
the  terminating  test  in  the  while  loop  gets  evaluated:  is  it 


<1>  "Baching  up"  may  be  needed  when  a  student  does  problem  3  in 
table  1  with  a  read  i/process  i  strategy. 
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Studtnt30^Pro&X«2: 


X*  Su».  5cort 


H««n  :  real: 


protraa  Student 19_?roblt«l : 


begin 
N  ;•  0; 

ou«  :•  0; 

Score  :•  0; 

»*MXe  iSum  <u  100)  oo 
begin 

Score  : »  Score  ^  i ; 
3u«  ;t  Sue  ♦  Score; 
N  :>  N  ♦  t 


eno; 

Mien  :t  5uo  /  j<; 

’•riteln  (’the  «een  e  MeenMO:IO) 
eno . 


ver  NuB}  Pree^nuit  Count  c-  integer; 


Count  :«  0; 

Reed  (Nua): 

Sua  :*  0; 
repeat 

Prev^niiB  :=  Nu«  -  i; 

Sua  7*  Null  ♦  Prev^nui; 

Sub  :s  Sua  ♦  1; 

Count  :s  Coutt  «•  i ; 
until  Count  s  tO; 

Average  :<  Su«  /  Count; 

Wrlteln  (’Average  of  ten  Integers  Is  equal  to  •:2) 
eno. 


Figure  y  these  prograns  are  atteapts  at  the  problems  described 
in  table  1,  they  illustrate  student  problcas  with  getting  a 
Sew^value,  These  prograos  have  been  ninlaally  edited  for 
presentation  here.  Students  wrote  these  progress  in  a  classroom, 
they  were  never  subaitted  to  a  translator. 


Erfifeita  JLL 


Oivtn  th«  followins  •tatMtnt: 

••At  th#  ls%t  company  cocktail  party*  for  evorg  6  pooplo  who  drank 
hard  llquour#  thoro  wort  11  pooplo  who  drank  httr.  ** 

Writ#  a  computtr  program  in  SASIC  which  will  output  the  number  of 
beer  drinkers  when  supplied  (via  user  input  at  the  terminal)  with  the 
number  of  hard  liciuour  drinkere.  Use  H  for  the  number  of  people  who 
drank  hard  liquour*  and  8  for  the  number  of  people  who  drank  beer. 


Sample  Size  X  Correct  X  Incorrect 

93  69  31 

Rratiia  Rl 

Given  the  following  statement: 

•*At  the  laet  compeny  cocktail  party*  for  every  6  people  who  drank 
hard  liquour*  there  were  11  people  who  drank  beer,  ^ 

Mrite  en  equetion  which  represents  the  above  statement.  Use  H  for  the 
number  of  people  who  drank  hard  liquour*  and  B  for  the  number  of 
people  who  drank  beer. 

Semple  Size  X  Correct  X  Incorrect 

5i  45  M 


Probability  of  these  results  on  tho  assumption  that 
problem  were  equally  likely  is  p  <  ,05 


Table  i 


errors 


on  each 
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evaluated  once*  at  the  top  of  the  loop*  or  is  the  test 
continually  evaluated  during  the  execution  of  the  body  of  the 
loop?  The  program  given  below  was  also  on  a  written  test  taken 
hy  the  31  summer  school  students. 

program  ProblemA) 

Count  :  intygyr* 

Count  :<■  Ot 
mlliijl  Count  <  7  jla 
k£JLla 

Writeln  <'*')> 

Count  Count  <4*  1) 

Writeln  ('/') 
fiHi 

siuL. 


If  the  students  felt  that  the  terminating  test  was  evaluated 
c.ontinuallu*  then  the  loop  ihgyjjl  terminate  an  '/•  were 
printed*  thus  providing  one  more  ***  and  */*.  <1>  In  otherwords* 
it  is  as  if  the  test  were  a  "demon"  watching  the  statements  in 
the  loop  body*  and  waiting  for  its  condition  to  become  true.  Of 
the  31  students*  34%  mads  the  above  mistake.  Given  the  ubiquity 
af  the  while  construct  in  programs  and  in  the  instruction*  and 
given  the  lateness  in  the  course  (the  end  of  the  semester)*  we 
felt  that  this  was  a  surprisingly  high  percentage. 

We  feel  that  the  basis  for  this  confusion  is  grounded  in  the 
mismatch  between  the  semantics  of  while  in  a  programming  language 
Context*  and  the  semantics  —  the  meaning  —  of  'while'  in  "every 
day  experience."  In  the  latter  case*  'while'  has  a  global  sense: 
Aurina  the  course  of  some  event.  In  contrast*  the  programming 
language  while  requires  a  local*  narrow  interpretation:  at  a 
Specific  point  in  time.  Clearly*  the  names  of  programming 
language  constructs  must  rely  on  real  World  semantics  of  their 
analogs.  However*  care  ought  to  be  exercised  in  their  selection. 
Again*  we  are  unlikely  to  change  Pascal  or  the  while  loop 
construct*  but  educators  must  take  note  of  this  error*  and  pay 
attention  to  it  in  their  instruction. 


V.  XtLB.  fcIfiAi  £fii:  Proceduralitu  jji  Languages  £££.  Novices 


<!>  We  were  not  interested  in  the  actual  number  of  and  '/'* 

i.  e.  *  we  were  not  studying  the  "off--by*-one"  bug  in  this 
particular  problem. 


131 


Workshop  Tho  Hunan  Conputar  Intarface 


Thara  it  a  dafinita  trand  in  programming  langauga  design  and 
programming  mathodology  .  towards  more  "formality.  "  For  example# 
"logic"  and  production  rules  have  been  seriously  suggested  as 
progamming  languages.  Dijsktra  suggests  that  the  process  of 
Writing  a  program  should  be  akin  to  that  of  writing  a 
mathematical  proof.  Backus'  new  language  takes  a  different#  yet 
similar  approach:  take  procadurality  out  of  the  programming 
language  and  make  it  algebra  based  to  facilitate  program  proofs. 
While  these  langauges  and  approaches  might  be  appropriate  for 
experts#  we  are  quite  skeptical  of  their  appropriateness  for 
novices.  We  are  seriously  concerned  that  programming  not  be 
equated  with  mathmatics.  For  whatever  reasons*  most  people  have 
a  great  deal  of  trouble  learning  and  using  mathematics.  We 
believe#  and  we  are  not  alone#  that  there  are  aspects  of 
programming  which  uniquely  lend  themselves  to  the  demystification 
of  mathematics.  The  formal  programming  people  propose  to  remove 
exactly  those  aspects  of  programming  while  increasing  required 
math  ability.  In  our  increasingly  sophisticated  world#  just 
plain  folks  will  need  to  "program"#  and  our  formal  programming 
friends  have  no  answers  for  these  non-professional  programmers. 
We  are  not  willing  to  write  off  just  plain  folks. 

In  the  following#  we  take  a  le&s  polemical#  and  more 
evidence  based  look  at  one  of  the  "unique  aspects  of  programming" 
alluded  to  above#  namely#  procedural ity. 


Procedural  Nffn-PrM.ctfwrgl;  That  is.  ihS 

The  first  study  which  we  feel  supports  the  need  to  keep 
procedural ity  in  programming  languages  for  novices  was  done  by 
Welty  and  Stemple  £19813.  They  compared  the  ability  of  novice 
subjects  to  write  database  queries  in  languages  with  different 
amounts  of  procadurality.  In  all  issues  except  procedurality# 
the  languages  were  identical.  A  typical  query  in  SQL#  the  less 
procedural  language#  is: 

.  SELECT  NAME 

FROM  STUDENTTABLE 
WHERE  HOMESTATE  »  'OHIO' 

The  equivalent  query  in  TABLET#  the  more  procedural  language#  is: 

FORM  OHIOANS  FROM  NAME#  HOMESTATE  OF  STUDENTTABLE 
KEEP  ROWS  WHERE  HOMESTATE  •  'OHIO'- 
'  PRINT  NAME 

In  their  paper  they  formalize  "amount  of  procedurality"  based  on 
the  number  of  variables#  the  number  of  operations#  and  the  degree 
to  which  the  bindings  and  operations  are  ordered  by  the  language 
semantics.  The  two  languages  were  learned  by  subjects  working 
largely  on  their  own.  The  same  example  problems  and  order  of 
presentation  was  used  for  each  group.  The  experiment  showed  that 
subjects  who  learned  the  more  procedural  query  language#  TABLET# 
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wrot«  difficult  qutri'tt  b*ttvT  than  those  using  the  less 
procedural  language  SOL. 

The  second  study  which  we  feel  supports  our  claim  is  being 
carried  out  by  Soloway  and  his  colleagues  at  UMASS.  In  our  work* 
****  explored  the  performance  of  students  on  "ratio”  type  word 
problems.  Typically#  half  the  students  in  a  low*-level 
programming  class  were  asked  to  solve  a  word  problem  with  an 
algebraic  equation#  while  the  other  half  were  asked  to  solve  the 
same  problem  with  a  program  <Table  3).  As  the  results  indicate# 
significantly  more  students  got  the  problem  correct  in  the  the 
programming  context  than  did  those  in  the  algebraic  context.  A 
number  of  these  experiments  have  been  run  in  which  various 
paramters  were  varied  (e.  g.  #  problem  wording).  In  all  cases  the 
results  were  similar  to  those  in  Table  3. 

We  have  a  number  of  specific  hypotheses  which  could  account 
for  this^  performance  difference.  The  basis  for  all  of  them# 
however#  is  procedural itu.  Some  students  who  used  algebra  as  the 
Solution  language  seemed  to  view  the  equation  as  a  "picture 
description: "  there  are  more  beer  drinkers  than  hard  liquour 
drinkers#  thus  IIB#  which  represents  the  beer  drinkers#  is 
related  to  6H#  the  hard  liquour  drinkers#  via  IIB  »  6H. 
Alternatively#  some  students  viewed  the  algebraic  equation  as 
"label  descriptors#”  much  like  "3ft.  lyd.  ”  <l>  On  the  other 

hand#  programming  appears  to  encourage  students  to  view  the 
equation  as  an  operation#  or  transformation.  That  is#  the 

that  variables  have  values#  and  that  variables  are  acted 
Upon  by  operations#  appear. more  understandable  to  students  in  the 
Programming  environment.  Thus«  the  procedural  nature  of 

Programming  seems  to  be  a  key  factor  in  understanding  and  using 
such  basic  concepts  as  variable#  operation#  equal  sign. 


Remarks 


Clearly#  this  note  is  only  a  "teaseri "  a  fuller  discussion 
of  these  issues  must  await  the  workshop.  We  genuinely  solicit 
your  comments#  and  look  forward  to  an  active  interchange  at  the 
workshop. 


<1>  These  hypotheses  are  based  on  the  analysis  of  many  hours  of 
video-taped  clinical  interviews  with  individual  students  as  they 
solved  problems  of  the  above  sort. 
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steamer:  An  Advanced  Computer  Aided  Instruction 
System  For  Teaching  Propulsion  Engineering 
Albert  L.  Stevens 
Michael  D.  Williams 
James  D.  Hollan 

In  this  presentation,  we  describe  the  current  state 
of  Steamer,  an  intelligent  CAI  system  with  a  graphics- 
based  human  interface.  Steamer  includes  a  math  model  of 
a  steam  plant,  an  interactive  graphics  front  end  and  a 
qualitative  modelling  component.  The  math  model  and 
graphics  interface  allows  the  student  to  control  and 
observe  a  simulated  steam  plant.  The  qualitative  model¬ 
ling  component  enables  Steamer  to  explain  in  casual 
terms  the  operation  of  components  and  subsystems.  The 
design  of  the  graphics  interface  is  based  on  object- 
oriented  programming  to  allow  much  more  modularity  and 
flexibility  than  is  normal  with  computer  graphics.  The 
qualitative  modelling  component  is  based  on  incremental 
qualitative  simulation  to  model  systems  in  terms  of 
psychologically  meaningful  events. 
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METAMORPHOSIS  THROUGH  METAPHOR 


J.C.  THOMAS 
IBM  CHQ  Armonk,NY 

The  problems  that  mankind  faces  in  the  twentieth  century 
sometimes  seem  insurmountable.  Nuclear  weapons,  the 
population  explosion,  rising  demand  and  falling  levels  of 
most  natural  resources  provide  a  potentially  devastating 
combination.  In  addition,  our  new  lifestyles  have  provided 
a  number  of  unwelcome  ecological  surprises. 

The  organism  and  the  environment  are  necessarily  in  an 
intimate  relationship.  Yet,  we  humans  are,  seemingly  by 
choice,  changing  our  environment  much  faster  than  we  can 
adapt  biologically.  It  seems  suicidal. 

The  only  major  way  out  of  these  dilemmas  is  for  effective 
human  intelligence  to  increase  dramatically  over  the  next 
century.  This  could  theoretically  be  accomplished 
biochemically,  educationally,  or  through  more  effective  group 
problem  solving  procedures. 

The  fourth  possibility,  which  is  addressed  in  this 
paper,  is  that  of  the  computer  augmenting  effective  human 
intelligence.  By  augmenting  effective  human  intelligence  I 
mean  that  by  using  a  computer,  people  will  operate  so  as  to 
bring  greater  short  and  long  term  happiness  to  themselves,  to 
mankind,  and  to  life  than  they  will  without  the  computer. 
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The  major  obstacle  to  this  goal  is  not  the  lack  of 
progress  in  computer  technology:  we  are  able  to  build 
smaller, faster,  cheaper  components.  (That  progress,  of 
course,  is  what  enables  us  to  address  the  next  problem). 

What  we  have  been  slow  to  achieve  is  a  computer  that  is 
anything  near  optimally  designed  to  help  a  human  being  do  a 
more  effective,  higher  quality  job.  In  order  to  accomplish 
this  latter  goal,  we  need  some  notion  of  what  humans  can  do, 
what  they  need  to  be  able  to  do  better  in  order  to  solve 
their  problems  and  what  the  capabilities  of  the  computer  are. 


In  this  paper,  I  will  focus  on  part  of  this  problem.  First, 
I  will  present  a  model  of  how  the  person  approaches  and 
learns  to  use  a  new  tool.  Second,  I  will  point  out  where  in 
this  process  there  is  likely  to  be  a  critical  breakdown 
which  prevents  the  person  from  using  the  tool  in  an  effective 
fashion  (e.g.,  to  solve  previously  insoluble  problems). 

Third, I  will  present  a  theory  of  what  the  tool  should  look 
like  and  provide  some  suggestively  supporting  evidence  based 
on  experimental  work  of  my  own  and  of  other  investigators. 

Fourth,  in  the  area  of  office  systems,  I  will  present  some 
examples  of  how  my  recommendations  might  be  implemented. 
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The  model  of  mind  is  multi -viewed;  at  the  current  state 
of  integration  of  behavioral  science  no  single  view  (e.g., 
behavioristic  or  cognitive)  provides  as  sufficient  a  scope  as 
does  a  multi -viewed  approach. 

The  presented  model  is  novel  in  the  context  of 
human-computer  interaction  in  the  notion  of  resource 
allocations  with  differentiably  usable  resources,  in  an 
emphasis  upon  motivational  issues,  and  in  the  analysis  of 
primary,  secondary  and  tertiary  memory  limitations. 

The  model  implies  that  under  certain  conditions  a  kind 
of  "gambler’s  ruin"  phenomenon  will  occur  in  which  the 
aspiring  learner  of  a  potentially  useful  system  will  stop 
short.  An  even  more  common  case  of  essentially  the  same 
phenomenon  will  occur  among  those  learners  who  learn  enough 
about  the  system  to  do  what  they  did  before  only  marginally 
better.  Rarely,  a  user  will  learn  an  interface  so  that  they 
are  truly  facile  with  the  facilities. 

Still  rarer  are  cases  in  which  the  computer-tool  allows 
a  qualitative  change  in  the  user's  work.  Yet  for  augmenting 
effective  human  intelligence,  it  is  this  last  category  that 
we  would  like  to  contain  the  majority  of  users.  For  such  a 
qualitative  change  to  occur,  the  interface  must  be  designed 
to  allow  a  more  optimal  allocation  of  the  user's 
psychological  resources . 
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One  way  of  accomplishing  this  latter  goal  is  through  the 
use  of  an  appropriate  metaphorical  interface  presented  to  the 
user  along  with  a  well  though-out  mapping  inside  the  computer 
system  that  translates  the  actions  the  user  takes  in  the 
metaphorical  space  into  the  appropriate  state  changes  in  the 
machine,  and  translates  the  machine  state  changes  into  the 
appropriate  presentations  in  the  user's  metaphor. 

A  large  body  of  empirical  evidence  strongly  suggests 
that  "meaningful"  material  can  greatly  affect  the  user's 
performance  quantitatively  and  in  some  cases  qualitatively. 
"Meaningfulness"  can  exist  at  many  levels.  Editing  commands 
that  are  more  English-like  are  better  than  their 
abbreviational  counterparts  (Ledgard,  et  als  (1980). 
Non-programmers  can  learn  an  English-like  query  language 
better  than  its  symbolic  counterpart  (Reisner,  1975).  Older 
subjects  particularly,  but  younger  ones  as  well,  are  aided  in 
learning  by  the  addition  of  "extra"  mnemonic  material  (Thomas 
&  Rubin,  1972). 

The  implications  of  these  findings  for  a  particular 
domain  -  office  systems  is  drawn  in  some  detail.  A  number  of 
objects,  organizing  schemes,  features,  and  actions  that 
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people  are  familiar  with  are  reviewed  along  with  the  way  in 
which  these  can  be  combines  to  let  the  user  know  what  is 
going  on.  The  model  explains  how  using  such  metaphors  can 
increase  comprehension,  motivation,  and  performance  of  given 
tasks  and  how  such  metaphors  can  be  used  to  improve  the 
effective  intelligence  that  goes  into  the  user’s  solutions. 

In  addition  to  using  metaphors,  a  better  allocation  of 
the  user's  psychological  resources  can  be  achieved  by  making 
more  complete  use  of  various  input  and  output 
characteristics  of  human  beings.  People  can  discriminate 
better  when  information  is  presented  on  a  large  number  of 
channels  (rather  than  a  single  channel).  People  can  also 
output  at  greater  data  rates  over  several  channels.  In 
traditional,  pencil  and  paper  editing,  non-verbal,  spatial 
!;ymbols  are  used  as  the  metalanguage  for  the  verbal 
material.  In  film  directing,  on  the  other  hand,  much  of  the 
metalanguage  is  verbal.  We  need  to  become  more  sensitive  to 
this  kind  of  "division  of  labor"  in  our  computer  interfaces. 


139 


A  SYSTEM  FOR  COMPUTER  AIDED 
MEMORIZATION 


Michael  D.  Williams 
Xerox  Palo  Alto  Research  Center 
Palo  Alto,  California 

James  D.  Hollan 

Navy  Personnel  Research  and  Development  Center 
San  Diego,  California 
and 

University  of  California,  San  Diego 
La  Jolla,  California 


We  are  constructing  an  intelligent  computer  based  instructional  system  to  facillitate  students  in 
the  memorization  of  a  large  collection  of  facts.  The  system  consists  of  a  series  of  games  played  on  a 
microprocessor,  a  relational  data  base  to  drive  the  games,  a  student  model,  and  a  computer  coach. 
To  the  student  the  system  appears  as  a  series  of  games  played  with  a  table  top  computer  against  a 
computerized  opponent.  Example  games  are  twenty  questions,  flash  cards,  a  property  specification 
game  where  students  successively  enhance  the  definition  of  an  object  until  one  or  no  objects  match 
the  cumulative  description,  a  picture  recognition  game,  and  a  concentration-like  table  fill-in  game. 
The  data  base  can  be  modified  to  allow  a  variety  of  topic  matters.  Present  data  bases  include  US 
and  Russian  ships,  their  radars  and  weapons,  South  American  geography,  the  anatomy  of  the 
human  hand,  and  a  fantasy  data  base  on  star  trek  trivia.  The  student  model  consists  of  a  simple 
marking  of  the  relations  in  the  data  base.  The  computer  coach  consists  of  a  series  of  opponents  of 
variable  "intelligence"  and  a  scheme  for  focusing  game  activity  on  portions  of  the  data  base  where 
the  student  is  weak  and  the  information  important. 

Our  principle  student  population  are  Naval  Officers  learning  the  properties  of  Russian  ships, 
radars,  and  weapons.  The  data  base  they  are  attempting  to  master  consists  of  thousands  of  facts. 
Approximately  3  and  1/2  weeks  of  a  6  week  course  on  tactical  decision  making  arc  taken  up  with 
lectures,  practice,  and  tests  to  support  this  memorization. 

Our  primary  scientific  goal  in  this  work  is  to  explore  the  process  of  remembering.  We  arc 
using  this  computerized  memorization  system  as  a  tool  to  gather  data  as  well  as  a  forcing  function 
to  drive  the  development  of  of  our  theory.  An  issue  that  anyone  building  a  computerized 
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instructional  system  must  confront  is  what  information  to  present  a  student  and  when  to  present  the 
information.  The  goal  for  our  theory  of  remembering  is  to  determine  the  implications  of  learning 
any  particular  piece  of  information  with  regard  to  the  durability  of  what  the  student  knows, 
flexibility  of  retreival,  errors  in  recall,  incidential  information  recovered,  and  speed  of  retrieval. 

We  come  to  the  problem  with  the  view  that  remembering  is  a  complex  process  of 
reconstruction  from  an  array  of  fragments.  An  essential  observation  is  that  people  memorize  more 
than  just  the  facts  in  the  data  base.  A  large  amount  of  their  learning  seems  to  focus  around 
abstractions  drawn,  in  part,  from  the  regularities  within  the  data  base.  Thus,  a  student  might  notice 
that  all  ships  which  carry  a  scoop-pair  radar  also  carry  shaddock  missiles  (this  is  because  the  scoop- 
pair  radar  is  the  guidence  radar  used  to  control  that  particular  missile,  it  has  no  function  without 
the  missile).  In  effect,  students  seem  to  be  building  a  "theory"  of  the  data  base  from  which  they 
can  reconstruct  the  portion  they  need  to  answer  any  given  query.  Given  tliat  this  is  the  case,  what 
we  are  looking  for  are  the  particular  mnemonic  effects  of  these  "abstractions",  and  principled 
reasons  for  these  effects  within  a  reconstructive  theory  or  remembering. 

Our  primary  engineering  goal  in  this  work  is  to  build  a  system  which  provides  substantial 
facilitation  to  students  who  must  memorize  some  collection  of  facts.  In  this  role  we  are  investing 
substantial  efforts  in  what  we  call  the  pragmatics  of  the  system  design.  Thus  we  are  using  computer 
games  to  enhance  motivation,  have  spent  large  amounts  of  time  designing  and  tuning  the  interface 
betweeen  student  and  machine,  and  are  using  a  technique  of  in  situ  development  to  tunc  the  system 
toward  realistic  user  needs. 
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